AUTOMATED PROCESS LINES
RELATED APPLICATIONS 

Benefit of priority to U.S. application Serial No. 09/285,481 to Hubert
Koster, Ping Yip, Jhobe Steadman, Dirk Reuter and Richard MacDonald, filed
April 2, 1999, entitled "AUTOMATED PROCESS LINE" is claimed. Where
permitted, the subject matter of this application is incorporated by reference in
its entirety.

FIELD OF THE INVENTION 

Provided herein are systems and methods using the systems for
performing high throughput analyses of biopolymers.
BACKGROUND OF THE INVENTION 

In recent years, developments in the field of life sciences have proceeded 
at a breathtaking rate. Ground breaking scientific discoveries and advances in 
such fields as genomics (sequencing and characterization of genetic information 
and analysis of the relationship between gene activity and cell function) and 
proteomics (systematic analysis of protein expression in tissues, cells and 
biological systems) promise to reshape the fields of medicine, agriculture and 
environmental science. The success of these efforts depends, in part, on the
development of sophisticated laboratory tools that will automate and expedite
the testing and analysis of biological samples.

Current methods of testing typically employ multiple instruments for 
preparing and analyzing samples and involve multiple manual handling steps and 
transfers. Such procedures are labor-intensive, time-consuming, and costly and 
they are susceptible to human error, sample contamination, and loss. After 
samples have been prepared, they can be subjected to testing procedures that 
produce data for analysis. Conventional testing procedures often must be 
performed by an individual laboratory technician, one sample at a time. 
Laboratory technicians are typically trained to operate only a single
instrument. Automation will reduce the number of
personnel and training necessary to carry out the research. Reliable and 
accurate automated process and analysis tools are necessary for the benefits of 
recent scientific discoveries to be fully achieved.

Genomic research is increasing the availability of genomic markers that
can be used for the identification of all organisms, including humans. These 
markers (all genetic loci including SNPs, microsatellites and other noncoding 
genomic regions) provide a way to not only identify populations but also allow 
stratification of populations according to their response to drug treatment, 
resistance to environmental agents, and other factors. Importantly, the 
identification of the large number of genomic markers has become the driving 
force behind the development of new automated technologies. 

At the forefront of the efforts to develop better analytical tools are efforts 
to expedite the analysis of complex biochemical structures. For example, robotic 
devices have been employed to assist in sample preparation and handling. 

Such automated sample preparation systems could find application in the
areas of: identification and validation of disease-causing genes or drug targets; 
defining mutations and polymorphisms, associated with specific diseases; 
monitoring gene expression and comparing disease states, cell cycles or other 
changes; genetic profiling of patients for responsiveness to genomics-based 
therapies; and genetic profiling of subjects in drug clinical studies to link 
response with genotype. 

The utility of genomic markers to identify and stratify populations
depends on the industry's ability to measure great numbers (100-100,000) of
markers in large populations. This approach is extremely limited in terms of time 
and research costs. Automation of these systems provides advantages such as 
increasing throughput and accuracy, but miniaturization also is an important 
consideration in terms of research costs. Accordingly, there is a need to 
automate processes in which very small volumes are handled, and retain the 
accuracy of the results to permit their use in high throughput screening protocols 
and diagnostics. 

Therefore it is an object herein to provide automated systems and 
methods for high-throughput analysis of biological samples, particularly samples 
of very small volume, for screening, diagnosis and other procedures. Other 
objects will become apparent from the following disclosure.

SUMMARY OF THE INVENTION

Provided herein is a fully automated modular analytical system that 
integrates sample preparation, instrumentation, and analysis of biopolymer 
samples. The samples include, but are not limited to, those containing
biopolymers, such as, but not limited to, nucleic acids, proteins, peptides,
carbohydrates, PNA (peptide nucleic acids), biopolymer (nucleic acid/peptide)
analogs, and libraries of combinatorial molecules. Samples of interest include
biological samples, such as, but not limited to, tissue and body fluid samples
from humans and other mammals.

The system integrates analytical methods of detection and analysis,
including, but not limited to, mass spectrometry, radiolabeling, mass tags,
chemical tags, fluorescence and chemiluminescence, with robotic technology
and automated chemical reaction systems to provide a high-throughput, accurate
system called an Automated Process Line (APL). The systems and methods
provided herein are particularly suited for handling very small volumes, on the
order of milliliters, microliters and submicroliters, nanoliters and even smaller
picoliter volumes.

In certain embodiments, the analytical system includes one portion that is 
a contamination-controlled environment, such as a clean room or laminar flow 
room, and includes a means, such as a transporter, for moving the samples from 
such environment into a second room or space for further processing. This dual 
space system permits performance of procedures that require clean room 
conditions to be automatedly linked to procedures that do not require such 
conditions. The systems are particularly useful for analysis of nucleic acid and 
protein samples, such as detection of polymorphisms, particularly single 
nucleotide polymorphisms, and particularly, using mass spectrometric analyses. 

Integrated systems for performing reactions, such as the system
exemplified herein and designated an Automated Process Line (APL), are
provided. Such a system includes a process line that has a plurality of
processing stations, each of which performs a procedure on a biological sample
contained in a reaction vessel; a robotic system that transports the reaction
vessel from processing station to processing station; a control system that
determines when the procedure at each processing station is complete and, in
response, moves the reaction vessel to the next test station, and continuously
processes reaction vessels one after another until the control system receives a
stop instruction; and a data analysis system that receives test results of the
process line and automatically processes the test results to make a
determination regarding the biological sample in the reaction vessel.
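
The control flow just described (wait for a station to finish, move the vessel on, repeat until told to stop) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Vessel, Station and run_line, and the station names in the usage lines, are hypothetical and do not come from the disclosure. A real control system would poll instruments and drive a robot; it could also keep several vessels in flight at once, which this sequential sketch does not attempt.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple


@dataclass
class Vessel:
    """One reaction vessel and the stations it has passed (hypothetical model)."""
    sample_id: str
    history: List[str] = field(default_factory=list)


# A station is a name plus a procedure that returns once its work is complete.
Station = Tuple[str, Callable[[Vessel], None]]


def run_line(stations: List[Station],
             vessels: Iterable[Vessel],
             stop_requested: Callable[[], bool]) -> List[Vessel]:
    """Process vessels one after another until a stop instruction arrives."""
    finished: List[Vessel] = []
    for vessel in vessels:
        if stop_requested():             # stop instruction checked between vessels
            break
        for name, procedure in stations:
            procedure(vessel)            # returning signals "procedure complete"
            vessel.history.append(name)  # stands in for the robotic transfer
        finished.append(vessel)          # results would go to data analysis here
    return finished


if __name__ == "__main__":
    # Station names below are assumptions chosen for illustration only.
    line = [("sample_preparation", lambda v: None),
            ("mass_spectrometry", lambda v: None)]
    done = run_line(line, [Vessel("S1"), Vessel("S2")], lambda: False)
    print([(v.sample_id, v.history) for v in done])
```

Passing the stop check in as a callable keeps the loop independent of how a stop instruction is actually delivered (operator console, scheduler, error handler).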

The automated system can run unattended continuously with a
continuous sample throughput and is capable of analyzing on the order of
10,000-50,000 genotypes per day. The results are highly accurate and
reproducible.
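
For a sense of scale, the daily figures can be converted into hourly and per-sample rates. The short sketch below assumes a full 24-hour unattended day, which is an assumption of this example rather than a statement in the disclosure; the function name throughput_rates is likewise hypothetical. Under that assumption, 10,000-50,000 genotypes per day corresponds to roughly 417 to 2,083 genotypes per hour, or one genotype about every 8.6 to 1.7 seconds.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def throughput_rates(genotypes_per_day: int) -> dict:
    """Convert a daily genotype count into hourly and per-sample figures."""
    if genotypes_per_day <= 0:
        raise ValueError("genotypes_per_day must be positive")
    return {
        "per_hour": genotypes_per_day / 24,
        "per_minute": genotypes_per_day / (24 * 60),
        "seconds_per_genotype": SECONDS_PER_DAY / genotypes_per_day,
    }


if __name__ == "__main__":
    for daily in (10_000, 50_000):
        rates = throughput_rates(daily)
        print(f"{daily} per day: {rates['per_hour']:.0f} per hour, "
              f"{rates['seconds_per_genotype']:.2f} s per genotype")
```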

Also provided herein are methods for automated analysis of biopolymers
using the integrated automated high throughput system. In preferred
embodiments, provided are automated methods for preparing a biological sample
for analysis; introducing the sample into an analytical instrument; recording
sample data; automatically processing and interpreting the data; and storing the
data in a bioinformatics database. In a particular embodiment, patient DNA
samples are automatically analyzed to determine genotype.
BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a diagram of the components of the automated process line.

FIG. 2 shows a magnetic construction of the magnetic lift illustrated
in FIG. 1.

FIG. 3 shows a point-magnet construction of the magnetic lift illustrated
in FIG. 1.

FIG. 4 shows the robotic interface between the chip processor and the
mass spectrometer of the automated process line illustrated in FIG. 1.

FIG. 5 shows a comparison of a mass spectrum of a test sample with
stored spectra from samples with known genotypes.

FIG. 6 is a flow diagram that illustrates the data analysis processing steps
performed by the automated process line of FIG. 1.

FIG. 7 shows an example of the user interface to the automated high
throughput system (the APL automated system).



WO 00/60361 



PCT/US00/08111 



FIG. 8 shows an example of the interface to a database of experimental
mass spectral data. 

DETAILED DESCRIPTION AND PREFERRED EMBODIMENTS 

Definitions 

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as is commonly understood by one of skill in the art to
which this invention belongs. All patents, patent applications and publications
referred to throughout the disclosure, including the background, herein are,
unless noted otherwise, incorporated by reference in their entirety. In the event
a definition in this section is not consistent with definitions elsewhere, the
definition set forth in this section will control.

As used herein, a molecule refers to any molecule or compound that is
linked and analyzed by the methods contemplated herein. For purposes
herein, the molecules are often linked to a solid support, such as a bead.
Typically such molecules are biological particles or macromolecules or
components or precursors thereof, such as peptides, proteins, small organics,
oligonucleotides or monomeric units of the peptides, organics, nucleic acids and
other macromolecules. A monomeric unit refers to one of the constituents from
which the resulting compound is built. Thus, monomeric units include, but are
not limited to, nucleotides, amino acids, and pharmacophores from which small
organic molecules are synthesized.

As used herein, macromolecule refers to a molecule having a molecular
weight [...]. [...] proteins, nucleotides, nucleic acids, and other such
molecules are [...] synthesized by biological organisms, but can be prepared
[...] recombinant molecular biology methods.

As used herein, a biological particle refers to a virus, such as a viral
vector or viral capsid with or without packaged nucleic acid, phage, including a
phage vector or phage capsid, with or without encapsulated nucleic acid, a
single cell, including eukaryotic and prokaryotic cells or fragments thereof, a
liposome or micellar agent or other packaging particle, and other such biological
materials. For purposes herein, biological particles include molecules that are not
typically considered macromolecules because they are not generally synthesized,
but are derived from cells and viruses.

As used herein, the term "nucleic acid" refers to single-stranded and/or
double-stranded polynucleotides such as deoxyribonucleic acid (DNA) and
ribonucleic acid (RNA) as well as analogs or derivatives of either RNA or DNA.
Also included in the term "nucleic acid" are analogs of nucleic acids such as
peptide nucleic acid (PNA), phosphorothioate DNA, and other such analogs and
derivatives.

As used herein, the term "biological sample" refers to material obtained
from any living source, including but not limited to [...]. For purposes herein,
the biological sample [...] biological [...] ([...] cerebral spinal fluid and other
body fluids).

As used herein, the terms "chain-elongating nucleotides" and
"chain-terminating nucleotides" are used in accordance with their art recognized
meaning. For example, for DNA, chain-elongating nucleotides include
2'-deoxyribonucleotides (e.g., dATP, dCTP, dGTP and dTTP) and
chain-terminating nucleotides include 2',3'-dideoxyribonucleotides (e.g., ddATP,
ddCTP, ddGTP, ddTTP). For RNA, chain-elongating nucleotides include
ribonucleotides (e.g., ATP, CTP, GTP and UTP) and chain-terminating nucleotides
include 3'-deoxyribonucleotides [...]. The term "nucleotide" is also well known
in the art.

As used herein, nucleotides include nucleoside mono-, di- and
triphosphates. Nucleotides also include modified nucleotides such as
phosphorothioate nucleotides [...]. A complete set of chain-elongating
nucleotides refers to four different nucleotides that can hybridize to each of
the four different bases comprising the [...].

As used herein, "multiplexing" refers to the simultaneous detection of
more than one analyte, such as more than one (mutated) locus on a particular
captured nucleic acid fragment (on one spot of an array).

As used herein, the term "biopolymer" is used to mean a biological 
molecule composed of two or more monomeric subunits, or derivatives thereof, 
which are linked by a bond or a macromolecule. A biopolymer can be, for 
example, a polynucleotide, a polypeptide, a carbohydrate, or a lipid, or 
derivatives or combinations thereof, for example, a nucleic acid molecule 
containing a peptide nucleic acid portion or a glycoprotein, respectively. The 
methods and systems herein, though described with reference to biopolymers, 
can be adapted for use with other synthetic schemes and assays, such as 
organic syntheses of pharmaceuticals, or inorganics and any other reaction or 
assay performed on a solid support or in a well in nanoliter volumes. 

As used herein, the term "nucleic acid" refers to single-stranded and/or 
double-stranded polynucleotides such as deoxyribonucleic acid (DNA), and 
ribonucleic acid (RNA) as well as analogs or derivatives of either RNA or DNA. 
Also included in the term "nucleic acid" are analogs of nucleic acids such as 
peptide nucleic acid (PNA), phosphorothioate DNA, and other such analogs and 
derivatives. 

As used herein, the term "polynucleotide" refers to an oligomer or 
polymer containing at least two linked nucleotides or nucleotide derivatives, 
including a deoxyribonucleic acid (DNA), a ribonucleic acid (RNA), and a DNA or 
RNA derivative containing, for example, a nucleotide analog or a "backbone"
bond other than a phosphodiester bond, for example, a phosphotriester bond, a
phosphoramidate bond, a phosphorothioate bond, a thioester bond, or a peptide
bond (peptide nucleic acid). The term "oligonucleotide" also is used herein 
essentially synonymously with "polynucleotide," although those in the art will 
recognize that oligonucleotides, for example, PCR primers, generally are less 
than about fifty to one hundred nucleotides in length. 

Nucleotide analogs contained in a polynucleotide can be, for example,
mass modified nucleotides, which allow for mass differentiation of
polynucleotides; nucleotides containing a detectable label such as a fluorescent,
radioactive, luminescent, or chemiluminescent label, which allows for detection of
a polynucleotide; or nucleotides containing a reactive group such as biotin or a
thiol group, which facilitates immobilization of a polynucleotide to a solid
support. A polynucleotide also can contain one or more backbone bonds that
are selectively cleavable, for example, chemically, enzymatically or
photolytically. For example, a polynucleotide can include one or more
deoxyribonucleotides, followed by one or more ribonucleotides, which can be
followed by one or more deoxyribonucleotides, such a sequence being cleavable
at the ribonucleotide sequence by base hydrolysis. A polynucleotide also can
contain one or more bonds that are relatively resistant to cleavage, for example,
a chimeric oligonucleotide primer, which can include nucleotides linked by
peptide nucleic acid bonds and at least one nucleotide at the 3' end, which is
linked by a phosphodiester bond, or the like, and is capable of being extended by
a polymerase. Peptide nucleic acid sequences can be prepared using well known
methods (see, for example, Weiler et al., Nucleic Acids Res. 25:2792-2799
(1997)).

A polynucleotide can be a portion of a larger nucleic acid molecule, for
example, a portion of a gene, which can contain a polymorphic region, or a
portion of an extragenic region of a chromosome, for example, a portion of a
region of nucleotide repeats such as a short tandem repeat (STR) locus, a
variable number of tandem repeats (VNTR) locus, a microsatellite locus or a
minisatellite locus. A polynucleotide also can be single stranded or double
stranded, including, for example, a DNA-RNA hybrid, or can be triple stranded or
four stranded. Where the polynucleotide is double stranded DNA, it can be in an
A, B, L or Z configuration, and a single polynucleotide can contain combinations
of such configurations.

As used herein, the term "polypeptide" means at least two amino acids
or amino acid derivatives, including mass modified amino acids and amino acid
analogs, that are linked by a peptide bond, which can be a modified peptide
bond. A polypeptide can be translated from a polynucleotide, which can include
at least a portion of a coding sequence, or a portion of a nucleotide sequence
that is not naturally translated due, for example, to its being located in a reading
frame other than a coding frame, or its being an intron sequence, a 3' or 5'
untranslated sequence, a regulatory sequence such as a promoter, or the like. A
polypeptide also can be chemically synthesized and can be modified by chemical
or enzymatic methods following translation or chemical synthesis. The terms
"polypeptide," "peptide" and "protein" are used essentially synonymously
herein, although the skilled artisan will recognize that peptides generally contain
fewer than about fifty to one hundred amino acid residues, and that proteins
often are obtained from a natural source and can contain, for example,
post-translational modifications. A polypeptide can be post-translationally
modified by phosphorylation (phosphoproteins), glycosylation (glycoproteins,
proteoglycans), and the like, which can be performed in a cell or in a reaction
in vitro.

As used herein, the term "conjugated" refers to stable attachment,
preferably ionic or covalent attachment. Among preferred conjugation means
are: streptavidin- or avidin-to-biotin interaction; hydrophobic interaction;
magnetic interaction (e.g., using functional magnetic beads, such as
DYNABEADS, which are streptavidin-coated magnetic beads sold by Dynal, Inc.,
Great Neck, NY and Oslo, Norway); polar interactions, such as "wetting"
associations between two polar surfaces or between oligo/polyethylene glycol;
formation of a covalent bond, such as an amide bond, disulfide bond, thioether
bond, or via crosslinking agents; and via an acid-labile or photocleavable linker.

As used herein, "equivalent," when referring to two sequences of nucleic
acids, means that the two sequences in question encode the same sequence of
amino acids or equivalent proteins. When "equivalent" is used in referring to
two proteins or peptides, it means that the two proteins or peptides have
substantially the same amino acid sequence with only conservative amino acid
substitutions that do not substantially alter the activity or function of the protein
or peptide. When "equivalent" refers to a property, the property does not need
to be present to the same extent [e.g., two peptides can exhibit different rates
of the same type of enzymatic activity], but the activities are preferably
substantially the same. "Complementary," when referring to two nucleotide
sequences, means that the two sequences of nucleotides are capable of
hybridizing, preferably with less than 25%, more preferably with less than 15%,
even more preferably with less than 5%, most preferably with no mismatches
between opposed nucleotides. Preferably the two molecules will hybridize under
conditions of high stringency.
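
The mismatch thresholds in the definition of "complementary" can be made concrete with a small calculation. The sketch below is illustrative only and rests on assumptions that are not part of the disclosure: both strands are equal-length DNA strings written 5' to 3' using the letters A, C, G and T, and mismatches are counted position by position between opposed nucleotides. The function names are hypothetical. Whether two strands actually hybridize also depends on the stringency conditions, which this counting exercise does not model.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_percent(strand: str, opposite: str) -> float:
    """Percentage of opposed positions that are not Watson-Crick pairs.

    Both strands are given 5'->3', so the opposite strand is read in
    reverse to line up the nucleotides that face each other.
    """
    if not strand or len(strand) != len(opposite):
        raise ValueError("strands must be non-empty and of equal length")
    pairs = zip(strand.upper(), reversed(opposite.upper()))
    mismatches = sum(1 for a, b in pairs if _COMPLEMENT.get(a) != b)
    return 100.0 * mismatches / len(strand)


def preference_tier(percent: float) -> str:
    """Map a mismatch percentage onto the tiers named in the definition."""
    if percent == 0:
        return "most preferred (no mismatches)"
    if percent < 5:
        return "even more preferred (less than 5%)"
    if percent < 15:
        return "more preferred (less than 15%)"
    if percent < 25:
        return "preferred (less than 25%)"
    return "outside the stated ranges"


if __name__ == "__main__":
    pct = mismatch_percent("ATGC", "GCAT")
    print(pct, preference_tier(pct))
```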

As used herein, the stringency of hybridization conditions used in
determining percentage mismatch is as understood by those of skill in the art;
the conditions typically are substantially equivalent to the following:

1) high stringency: 0.1 x SSPE, 0.1% SDS, 65°C

2) medium stringency: 0.2 x SSPE, 0.1% SDS, 50°C

3) low stringency: 1.0 x SSPE, 0.1% SDS, 50°C

It is understood that equivalent stringencies may be achieved using alternative
buffers, salts and temperatures.

As used herein, a primer, when set forth in the claims, refers to a nucleic
acid suitable for mass spectrometry methods that require immobilizing,
hybridizing, strand displacement or sequencing. Such a primer must be of low
enough mass, typically about 70 nucleotides or fewer, and of sufficient size to
be useful in the methods described herein that rely on mass spectrometry
detection. These methods include primers for detection and sequencing of
nucleic acids, which require a sufficient number of nucleotides to form a stable
duplex, typically about 6-30, preferably about 10-25, more preferably about
12-20. Thus, for purposes herein, a primer will be a sequence of nucleotides
comprising about 6-70, more preferably about 12-70, more preferably greater
than about 14 to an upper limit of 70, depending upon sequence and application
of the primer. The primers herein, for example for mutational analyses, are
selected to be upstream of loci useful for diagnosis such that, when sequencing
is performed up to or through the site of interest, the resulting fragment is of a
mass that is sufficient, and not too large, to be detected by mass spectrometry.
For mass spectrometry methods, mass tags or modifiers are preferably included
at the 5'-end, and the primer is otherwise unlabeled.
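
The length ranges in the primer definition can be collected into a simple check. The sketch below is a hedged illustration: the definition uses "about" throughout, so treating the bounds as hard cutoffs is an assumption of this example, and the function name primer_length_report is hypothetical.

```python
def primer_length_report(sequence: str) -> dict:
    """Report which of the stated length ranges a candidate primer falls in.

    The bounds are taken from the definition and treated as exact here,
    although the text qualifies each of them with "about".
    """
    n = len(sequence)
    return {
        "length": n,
        "in_primer_range": 6 <= n <= 70,          # about 6-70
        "in_preferred_range": 12 <= n <= 70,      # about 12-70
        "in_more_preferred_range": 14 < n <= 70,  # greater than about 14, up to 70
        "typical_duplex_length": 6 <= n <= 30,    # about 6-30 for a stable duplex
    }


if __name__ == "__main__":
    # A made-up 20-nucleotide sequence used purely as an example input.
    print(primer_length_report("ACGTACGTACGTACGTACGT"))
```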

As used herein, "conditioning" of a nucleic acid refers to modification of
the phosphodiester backbone of the nucleic acid molecule (e.g., cation
exchange) for the purpose of eliminating peak broadening due to a heterogeneity
in the cations bound per nucleotide unit. By contacting a nucleic acid molecule
with an alkylating agent such as alkyliodide, iodoacetamide, β-iodoethanol or
2,3-epoxy-1-propanol, the monothio phosphodiester bonds of a nucleic acid
molecule can be transformed into phosphotriester bonds. Likewise,
phosphodiester bonds may be transformed to uncharged derivatives employing
trialkylsilyl chlorides. Further conditioning involves incorporating nucleotides that
reduce sensitivity for depurination (fragmentation during MS), e.g., a purine
analog such as N7- or N9-deazapurine nucleotides, or RNA building blocks, or
using oligonucleotide triesters, or incorporating phosphorothioate functions that
are alkylated, or employing oligonucleotide mimetics such as peptide nucleic acid.

As used herein, the term "solid support" means a non-gaseous, non-liquid
material having a surface. Thus, a solid support can be a flat surface
constructed, for example, of glass, silicon, metal, plastic or a composite; or can
be in the form of a bead such as a silica gel, a controlled pore glass, a magnetic
or cellulose bead; or can be a pin, including an array of pins suitable for
combinatorial synthesis or analysis.

As used herein, substrate refers to an insoluble support onto which a
sample is deposited according to the materials described herein. Sample may be
linked directly or via a linker and retained by any suitable means, including
covalent, ionic and other bonds and other interactions. Examples of appropriate
substrates include, but are not limited to, solid supports such as silica gel,
controlled pore glass, magnetic beads, agarose gels and crosslinked dextroses
(i.e., Sepharose and Sephadex), cellulose and other materials known to those of
skill in the art to serve as solid support matrices. For example, substrates may be
formed from any or combinations of: silica gel, glass, magnetic materials,
polystyrene/1% divinylbenzene resins, such as Wang resins, which are
Fmoc-amino acid-4-(hydroxymethyl)phenoxymethylcopoly(styrene-1%
divinylbenzene (DVB)) resin, chlorotrityl (2-chlorotritylchloride
copolystyrene-DVB resin) resin, Merrifield (chloromethylated copolystyrene-DVB)
resin, metal, plastic, cellulose, cross-linked dextrans, such as those sold under
the trade name Sephadex (Pharmacia) and agarose gel, such as gels sold under the
trade name Sepharose (Pharmacia), which is a hydrogen bonded
polysaccharide-type agarose gel, and other such resins and solid phase supports
known to those of skill in the art. The support matrices may be in any shape or
form, including, but not limited to: capillaries, flat supports such as glass fiber
filters, glass surfaces, metal surfaces (steel, gold, silver, aluminum, copper and
silicon), plastic materials including multiwell plates or membranes (e.g., of
polyethylene, polypropylene, polyamide, polyvinylidenedifluoride), pins (e.g.,
arrays of pins suitable for combinatorial synthesis or analysis), or beads in pits
of flat surfaces such as wafers (e.g., silicon wafers) with or without plates, and
beads. The supports include any supports used for retaining or conjugating
macromolecules and biopolymers, and biological particles.

As used herein, a selectively cleavable linker is a linker that is cleaved
under selected conditions, such as a photocleavable linker, a chemically
cleavable linker and an enzymatically cleavable linker (i.e., a restriction
endonuclease site or a ribonucleotide/RNase digestion). The linker is interposed
between the support and immobilized DNA.

As used herein, the term "liquid dispensing system" means a device that
can transfer a predetermined amount of liquid to a target site. The amount of
liquid dispensed and the rate at which the liquid dispensing system dispenses the
liquid to a target site, which can contain a reaction mixture, can be adjusted
manually or automatically, thereby allowing a predetermined volume of the liquid
to be maintained at the target site.

As used herein, the term "liquid" is used broadly to mean a non-solid,
non-gaseous material, which can be homogeneous or heterogeneous and can
contain one or more solid or gaseous materials dissolved or suspended therein.
In general, a liquid is a component of a reaction mixture that is susceptible to
evaporation under the conditions of the reaction. In particular, the liquid can be
a solvent, in which a reaction is performed, for example water or glycerol/water
or buffer or reaction mixture, where the reaction is performed in an aqueous
solution. The liquid can be any non-solid, non-gaseous solvent or other
component of a reaction mixture that is susceptible to evaporative loss, for
example, acetonitrile, which can be a solvent for a nucleic acid synthesis
reaction; formamide, which can be a liquid component of a nucleic acid
hybridization reaction; pyridine, which is a liquid component of a nucleic acid
sequencing reaction; or any other non-aqueous solvent or other liquid
component. A liquid can contain dissolved or suspended components, which
can be useful, for example, for initiating, terminating or changing the conditions
of a reaction, thereby facilitating the performance of single tube reactions.

As used herein, the term "reaction mixture" refers to any mixture,
including solutions and suspensions, in which a chemical, physical or biological
change is effected. In general, a change to a molecule is effected, although
changes to cells also are contemplated. A reaction mixture can contain a
solvent, which provides, in part, appropriate conditions for the change to be
effected, and a substrate, upon which the change is effected. A reaction
mixture also can contain various reagents, including buffers, salts, and metal
cofactors, and can contain reagents specific to a reaction, for example,
enzymes, nucleoside triphosphates, amino acids, and the like. For convenience,
reference is made herein generally to a "component" of a reaction, wherein the
component can be a cell or molecule present in a reaction mixture, including, for
example, a biopolymer or a product thereof.

As used herein, the term "target site" refers to a specific locus on a solid
support that can contain a liquid. A solid support contains one or more target
sites, which can be arranged randomly or in ordered array or other pattern. In
particular, a target site restricts growth of a liquid to the "z" direction of an xyz
coordinate. Thus, a target site can be, for example, a well or pit, a pin or bead,
or a physical barrier that is positioned on a surface of the solid support, or
combinations thereof such as beads on a chip, chips in wells, or the like. A
target site can be physically placed onto the support, can be etched on a surface
of the support, can be a "tower" that remains following etching around a locus,
or can be defined by physico-chemical parameters such as relative hydrophilicity,
hydrophobicity, or any other surface chemistry that allows a liquid to grow
primarily in the z direction. A solid support can have a single target site, or can
contain a number of target sites, which can be the same or different, and where
the solid support contains more than one target site, the target sites can be
arranged in any pattern, including, for example, an array, in which the location of
each target site is defined.

As used herein, the term "predetermined volume" is used to mean any
desired volume of a liquid. For example, where it is desirable to perform a
reaction in a 5 microliter volume, 5 microliters is the predetermined volume.
Similarly, where it is desired to deposit 200 nanoliters at a target site,
200 nanoliters is the predetermined volume.

As used herein, a small volume typically refers to a volume on the order
of nanoliters, preferably less than 1 microliter and typically 0.5 microliters
or less. The term nanoliter volume refers to a volume of about 0.1 to about
1000 nanoliters, preferably about 1 to 100 nanoliters.

As used herein, symbology refers to the code, such as a bar code, that is
engraved or imprinted on a surface. The symbology is any code known or
designed by the user.

As used herein, a bar code refers to a symbology that is any array of,
preferably, optically readable marks of any desired size and shape that are
arranged in a reference context or frame of, preferably, although not necessarily,
one or more columns and one or more rows. For purposes herein, the bar code
refers to any symbology, not necessarily "bars," and may include dots, characters
or any symbol or symbols.

As used herein, the disclosed systems and methods generally are 
useful where the reaction volume is about 500 milliliters or less; are more useful 
where the reaction volume is about 5 milliliters or less; are most useful where 
the reaction volume is in the "submilliliter" range, for example, about 500 
microliters, or about 50 microliters or about 5 microliters or less; and are 
particularly useful where the reaction volume is a "submicroliter" reaction 
volume, which can be measured in nanoliters, for example, about 500 nanoliters 
or less, or 50 nanoliters or less or 10 nanoliters or less, or can be measured in 
picoliters, for example, about 500 picoliters or less or about 50 picoliters or less. 
For convenience of discussion, the term "submicroliter" is used herein to refer to 
a reaction volume less than about one microliter, although it will be readily 
apparent to those in the art that the systems and methods disclosed herein are 
applicable to subnanoliter reaction volumes as well. 
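
The volume ranges named above can be summarized in a short illustrative sketch. The function name `classify_reaction_volume` and the cut-offs at 1 milliliter, 1 microliter and 1 nanoliter are assumptions chosen to mirror the terms "submilliliter", "submicroliter" and "subnanoliter"; they are not values fixed by the disclosure.

```python
def classify_reaction_volume(volume_liters: float) -> str:
    """Return a coarse label for a reaction volume given in liters.

    Hypothetical helper: the thresholds are assumptions that follow the
    "sub-" prefixes used in the text (below 1 mL, 1 uL and 1 nL).
    """
    if volume_liters <= 0:
        raise ValueError("volume must be positive")
    if volume_liters < 1e-9:
        return "subnanoliter"
    if volume_liters < 1e-6:
        return "submicroliter"
    if volume_liters < 1e-3:
        return "submilliliter"
    return "milliliter or larger"
```

Under these assumed thresholds, a 500 nanoliter reaction is labeled "submicroliter" and a 5 microliter reaction is labeled "submilliliter".
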

As used herein, a "peak" or graph encompasses the visual depiction 
thereof and also the numerical values from which the visual depiction is 
generated. A mass spectrum, for example, does not have to be depicted as 
peaks, but can be depicted in other formats, including numerical values from 
which the graph or visual depiction is generated. 

As used herein, a room refers to a space, such as a room, chamber or a 
hood or other enclosure that is in some manner separated. In an embodiment 
herein, the system is designed to operate in two rooms, such that manipulations 
that require sterile conditions can be performed in one room or chamber. 
Manipulations that do not require such conditions can be performed in a second 
room. Samples can then be automatically transported between the first room 
and second room. As desired, additional rooms, with conditions designed for a 
particular set of manipulations, may be included in the system. 

Automated High Throughput Integrated System 

In the automated system constructed in accordance with the disclosure 
herein, one or more robotic systems under computer control are used to 
manipulate the sample of interest. The robot(s) are commanded by controlling 
software and move the sample between the series of reaction and sample 
preparation stations that comprise the system. The robot includes a robotic arm 
that moves, for example, along a track or on a central pivot, and is typically 
outfitted with a "gripper" arm, allowing it to grip reaction vessels and transport 
them between stations. Such robotic systems are commercially available and 
are commonly known to those of skill in the art. For example, a robotic system 
and accompanying software can be obtained from Robocon Labor-und 
Industrieroboter Ges.m.b.H of Austria ("Robocon"). In a preferred embodiment, 
the system is the APL automated system and it includes a Robocon "Model CRS 
A 255" robot, equipped with a "Digital Servo Gripper" mechanism, also available 
from Robocon. The robotic systems are designed such that they can be 
integrated with other computer-controlled instrumentation to perform 
consecutive operations to effect a multi-step process. 

In the preferred embodiment, one robot moves along a central track in a 
contamination-controlled environment, such as a positive airflow or laminar flow 
chamber, to perform a series of manipulations or reactions on a biological 
sample. Once these steps are completed, the sample enters a second 
contamination-controlled environment, which serves as an antechamber into a 
non-sterile environment. The second environment can be sealed off from the 
first contamination-controlled environment and/or the non-sterile environment. 
For example, in a particular embodiment, the sample is transported from the 
contamination-controlled laminar flow chamber into a transport chamber or so- 
called "taxicab." If desired, the taxicab can provide a sterile environment as 
well. 

Upon entry of the sample into the transport chamber, the contamination- 
controlled environment is sealed off. The sample then moves along a 
pneumatically-driven or motor-driven stage in the transport chamber, and the 
transport chamber then opens up into the non-sterile environment, such as an 
open room. In the open room, a second robot, also moving along a central 
track, takes control of manipulating the sample. 
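
The door sequence described above can be sketched as a small state model. This is a minimal illustration only: the class `TransportChamber`, its method names, and the rule that both doors are never open at the same time are assumptions inferred from the statement that the contamination-controlled environment is sealed off before the chamber opens to the open room.

```python
class TransportChamber:
    """Toy model of the transport chamber ("taxicab") door sequence."""

    def __init__(self) -> None:
        self.clean_door_open = False
        self.open_room_door_open = False
        self.log = []

    def _set(self, clean: bool, open_room: bool, event: str) -> None:
        # Assumed interlock: the two doors are never open together.
        if clean and open_room:
            raise RuntimeError("both doors open")
        self.clean_door_open = clean
        self.open_room_door_open = open_room
        self.log.append(event)

    def transfer(self, plate_id: str) -> list:
        """Walk one plate from the clean side to the open room."""
        self._set(True, False, f"clean-side door opens for {plate_id}")
        self._set(False, False, "clean-side door closes; clean environment sealed off")
        self.log.append("stage moves plate through chamber")
        self._set(False, True, "open-room door opens")
        self._set(False, False, "open-room door closes after pickup by second robot")
        return self.log
```

Calling `TransportChamber().transfer("plate-1")` returns the five logged events in order and leaves both doors closed.
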

The sample to be analyzed is contained within a reaction vessel that is 
designed to integrate with all of the components of the system and that is 
amenable to the conditions of the chemical or biological reactions performed. 
Preferred for high throughput analysis are reaction vessels that are capable of 
containing multiple samples, such as multi-well microtiter plates, preferably 96- 
well or 384-well plates, or chips, such as silicon microchips. The reaction 
vessels also can comprise flat chips with reaction sites that are not wells but 
physical locations that contain the reaction using a chemical barrier. In certain 
embodiments, the robot and/or gripper is adapted to hold a sample vessel. For 
example, pins may be added to the gripper in alignment with the wells of a 
microtiter plate for transporting the sample. 
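
For controlling software that addresses individual wells, a plate position can be mapped to the conventional row-letter and column-number label. The sketch below is illustrative; the names `PLATE_LAYOUTS` and `well_name`, the row-by-row numbering, and the 16 x 24 layout assumed for 384-well plates are not taken from the disclosure.

```python
# (rows, columns) for the two plate formats named in the text.
# The 16 x 24 layout for 384-well plates is an assumption of this sketch.
PLATE_LAYOUTS = {96: (8, 12), 384: (16, 24)}


def well_name(index: int, n_wells: int = 96) -> str:
    """Convert a zero-based, row-by-row well index to a label such as 'A1'."""
    rows, cols = PLATE_LAYOUTS[n_wells]
    if not 0 <= index < rows * cols:
        raise IndexError(f"index {index} is outside a {n_wells}-well plate")
    row, col = divmod(index, cols)
    return f"{chr(ord('A') + row)}{col + 1}"
```

For a 96-well plate, index 0 maps to A1, index 12 to B1 and index 95 to H12.
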

In high-throughput applications, where multiple sample plates are to be 
analyzed successively in an automated fashion, the samples can be held in a 
sample storage system, or rack, where they are picked up by the system robot 
and processed. An example of such a sample storage system, for use with 
multi-well microtiter plates, is the Robocon "Plate Cube" system. 

In steps where sample vessels are to be sealed, such as when subjected 
to PCR amplification, or unsealed, such as for reagent addition or removal, an 
automated lid application/removal and sealing system may be integrated into the 
system. Examples of these include a lid parking station, such as is available 
from Robocon, and a plate sealer, such as the "MJ Microseal", available from 
MJ Research. A system turntable might also be employed to assist the system 
robot in orienting the samples for delivery into each station of the system. Such 
a turntable is available, for example, from Robocon. Additionally, a shaker is 
included in the system in embodiments where beads, solid supports or other 
reagents are added to the sample for immobilizing the sample, or where other 
manipulations requiring mechanical shaking are involved. 

In preferred embodiments, the sample plate or vessel is coded with a 
symbology, such as a bar code, which can be read by a reader, to allow sample 
tracking. In the preferred embodiment, separate bar code readers are contained 
in the contamination-controlled and non-sterile environments. Bar code systems, 
including one and two dimensional bar codes, readable and readable/writable 
codes and systems therefor, are widely available, such as from Datalogic S.p.A. 
of Italy ("Datalogic"), and are well known to those skilled in the art. 
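
Sample tracking with separate readers in each environment can be sketched as a simple log keyed by bar code. The classes `PlateRecord` and `SampleTracker`, the reader labels and the example code "PLATE-0001" are hypothetical; no vendor software interface is implied.

```python
from dataclasses import dataclass, field


@dataclass
class PlateRecord:
    """Read history for one bar-coded plate (hypothetical structure)."""
    barcode: str
    reads: list = field(default_factory=list)


class SampleTracker:
    """Keeps one record per bar code and appends each reader event to it."""

    def __init__(self) -> None:
        self.plates = {}

    def record_read(self, barcode: str, reader: str) -> PlateRecord:
        record = self.plates.setdefault(barcode, PlateRecord(barcode))
        record.reads.append(reader)
        return record

    def last_seen(self, barcode: str) -> str:
        """Return the name of the reader that most recently saw the plate."""
        return self.plates[barcode].reads[-1]
```

Recording one read in each environment for the same code leaves a single record whose last reader is the one in the non-sterile environment.
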

Sample handling and reagent additions are accomplished using automated 
liquid handling systems. These include systems capable of automatically 
dispensing liquids into the sample vessel, such as through a pipette, and can be 
adapted to any sample format, such as a multiwell microtiter plate. Such 
systems are commercially available, such as from Tecan AG of Switzerland 
("Tecan") or Beckman Coulter, Inc. In a preferred embodiment, Tecan "Genesis 
200/8" (200 cm width, including an 8-tip arm) liquid handling systems, as well as 
a Beckman Coulter "Multimek 96" automated pipettor, are used for liquid 
handling. Other liquid dispensing systems are described in allowed U.S. application 
Serial No. 08/787,639, now U.S. Patent No. 6,024,925, U.S. application Serial 
No. 08/786,988, and published International PCT application No. WO 98/20166, 
which are incorporated herein by reference. 

Also present in the system may be an apparatus for preparing a test 
sample for analysis, including, for example, reagent addition means, or other 
means for performing reactions or processes to prepare the sample for analysis. 
In certain preferred embodiments, where mass spectral analysis, specifically 
MALDI-TOF analysis, is to be performed using a sample array, a matrix material 
(e.g., an organic acid) is added to the sample using an adapted piezoelectric 
pipetting dispensing system. The dispensing system includes a hydrophobic tip 
which is capable of dispensing submolar, preferably nanomolar, samples. Such 
systems, as well as methods for preparing and analyzing low volume analyte 
array elements, have been described in allowed U.S. Patent Application No. 
08/787,639, now U.S. Patent No. 6,024,925, U.S. application Serial No. 
08/786,988, and published International PCT application No. WO 98/20166 
(see also Little et al., Anal. Chem. 1997, 69, 4540-4546), the contents of which 
are incorporated by reference herein in their entirety. 

Alternatively, a system that dispenses liquid samples from the picoliter up 
to the nanoliter range is commercially available, such as the "Nano-Plotter" 
product from GeSiM GmbH of Germany ("GeSiM"). In other embodiments, 
reactions such as radiolabeling or adding a mass tag to the sample may be 
performed by the sample preparation apparatus. 

A sample may also be transferred to or placed in a particular sample 
analysis vessel for analysis. The particular type of sample analysis vessel used 
is determined by the analytical method to be employed. For example, in a 
preferred embodiment, where mass spectrometry (MALDI-TOF) is used for 
analysis of a sample, a typical sample vessel is a silicon microchip « 1 square 
inch, that includes one or more, 100, 200, 300, 400, 500, up to 999 diagnostic 
sites (typically in multiples of 96, such as 96, 384 and higher densities), or even 
higher density, on a single chip, preferably in the pattern of a 2-D array. The 
chip, or multiple chips, can then be placed on a sample platform, designed 
specifically to be inserted into the mass spectrometer. 

In a preferred embodiment, the analytical system is a MALDI-TOF mass 
spectrometer. A preferred mass spectrometer is manufactured by 
Bruker-Franzen Analytik GmbH of Germany ("Bruker") and uses a UV laser. In 
the spectrometer, a brief pulse of laser irradiation is absorbed by the matrix, 
leading to spontaneous volatilization and ionization of the matrix and DNA 
fragments. The molecular weight of the gas-phase ions is then determined by 
measurement of the time-of-flight of the ions, which increases with their mass. 
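
The relation between flight time and mass can be illustrated for an idealized linear time-of-flight analyzer, in which an ion of charge z accelerated through a potential V drifts over a field-free length L, so that the flight time grows with the square root of the mass-to-charge ratio. The sketch below is a textbook simplification, not the calibration used by any particular instrument; the function names and the default values of 20 kV and 1 m are assumptions.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs
DALTON_KG = 1.66053906660e-27          # kilograms per dalton


def flight_time_s(mass_da: float, charge: int = 1,
                  voltage_v: float = 20_000.0,
                  drift_length_m: float = 1.0) -> float:
    """Ideal drift time: all kinetic energy z*e*V is gained before the drift region."""
    mass_kg = mass_da * DALTON_KG
    kinetic_energy_j = charge * ELEMENTARY_CHARGE_C * voltage_v
    velocity_m_s = math.sqrt(2.0 * kinetic_energy_j / mass_kg)
    return drift_length_m / velocity_m_s


def mass_from_flight_time_da(time_s: float, charge: int = 1,
                             voltage_v: float = 20_000.0,
                             drift_length_m: float = 1.0) -> float:
    """Invert the ideal relation: m = 2*z*e*V*(t/L)**2, returned in daltons."""
    kinetic_energy_j = charge * ELEMENTARY_CHARGE_C * voltage_v
    mass_kg = 2.0 * kinetic_energy_j * (time_s / drift_length_m) ** 2
    return mass_kg / DALTON_KG
```

In this idealized model, quadrupling the mass of a singly charged ion doubles its flight time, and converting a computed flight time back to mass recovers the input value.
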

It should be understood that the nature of the sample to be analyzed and 
the analysis to be performed, as well as the feasibility of automating a reaction 
process, determine the components integrated into the system, and the system 
is not to be limited to the particular embodiments described herein. 

Module for performing the reaction in an unsealed environment 
Systems for performing a reaction in an unsealed environment are 
provided in copending U.S. application Serial No. 09/266,409, filed March 10, 
1999, and International PCT application No. PCT/US00/06288, filed 
March 10, 2000. These systems may be integrated into the high throughput 
automated systems provided herein. Briefly, the systems and 
methods provide a means for performing reactions in an unsealed environment, 
such as in unsealed containers and on unsealed surfaces, by monitoring and 
maintaining the reaction volume. The liquid generally is present on a surface of a 
solid support, at a target site, and the environment into which evaporation can 
occur is air. The systems and methods provide a means to maintain a volume 
of a liquid at a predetermined volume, where the volume otherwise would 
decrease below the predetermined volume due to evaporation. These systems 
include a support for performing the reaction; a nanoliter dispensing pipette for 
dispensing an amount of a liquid onto the surface of the support; a temperature 
controlling device for regulating the temperature of the support; and means for 
controlling the amount of liquid dispensed, wherein the amount of liquid 
dispensed corresponds to the amount of liquid that evaporates from the support, 
wherein the system is not sealed. Hence, the systems for 
performing the reactions in an unsealed environment are contemplated for 
inclusion as a module in the systems provided herein. 
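
The principle of replacing evaporated liquid so that the volume stays at the predetermined volume can be sketched as a simple top-up loop. This is an illustration under stated assumptions, not the control scheme of the referenced applications: volumes are tracked as whole picoliters, evaporation per step is taken as known, and the names `droplets_needed` and `simulate_top_up` are hypothetical.

```python
def droplets_needed(current_pl: int, target_pl: int, droplet_pl: int) -> int:
    """Whole droplets that fit into the deficit without overshooting the target."""
    deficit = target_pl - current_pl
    return max(deficit, 0) // droplet_pl


def simulate_top_up(target_pl: int, evaporation_pl_per_step: int,
                    droplet_pl: int, steps: int) -> list:
    """Return the volume after each step of evaporation followed by a top-up."""
    volume = target_pl
    history = []
    for _ in range(steps):
        volume -= evaporation_pl_per_step  # loss to the surrounding air
        volume += droplets_needed(volume, target_pl, droplet_pl) * droplet_pl
        history.append(volume)
    return history
```

With a 200 nanoliter target, 500 picoliter droplets and 1500 picoliters lost per step, each top-up restores the target exactly; when the loss is not a multiple of the droplet size, the volume stays within one droplet of the target.
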
Analytical methods 

The systems herein can be used to perform a number of different 
reactions, dependent upon the nature of the sample and the analysis to be 
performed. The system is typically used to perform analysis on biological 
samples, typically biopolymers, including nucleic acids, proteins, peptides and 
carbohydrates. Methods of analysis of the biological samples include all known 
methods of analysis, including, but not limited to mass spectrometry (all light 
wavelengths), radiolabeling, mass tags, chemical tags, fluorescence, and 
chemiluminescence, and particularly mass spectrometry. 

In a preferred embodiment, the sample is a purified, previously amplified 
portion of genomic DNA or a genomic DNA sample. For analysis of DNA samples, 
reactions such as nucleic acid amplification (e.g., PCR, ligase chain reaction) and 
enzymatic reactions, such as primer oligonucleotide base extension (PROBE), 
nested PCR or sequencing, may be performed. In addition, the apparatus can be 
used for hybridization (sequencing and diagnostic) reactions, and endo- and 
exonuclease mapping of biopolymers. 

In certain embodiments, the sample may be immobilized on a solid 
support during all or part of the automated process. For example, enzymatic 
reactions, including diagnostics, such as a method designated primer oligo base 
extension (PROBE; see, e.g., published International PCT application No. WO 
98/20019 and U.S. Patent No. 6,043,031), nested PCR, sequencing, and other 
analytical and diagnostic procedures can be performed on solid supports (see, 
e.g., U.S. Patent No. 5,605,798). Briefly, PROBE uses a single detection primer 
followed by an oligonucleotide extension step to give products, which can be 
readily resolved by MALDI-TOF mass spectrometry. The products differ in 
length by a number of bases specific for a number of repeat units or for second 
site mutations within the repeated region. The method is exemplified using as a 
model system the AluVpA polymorphism in intron 5 of the interferon-α receptor 
gene located on human chromosome 21, and the poly T tract of the splice 
acceptor site of intron 8 from the CFTR gene located on human chromosome 7. 
The method is advantageously used, for example, for determining identity, 
identifying mutations, familial relationship, HLA compatibility and other such 
markers, using PROBE-MS analysis of microsatellite DNA. In a preferred 
embodiment, the method includes the steps of a) obtaining a biological sample 
from two individuals; b) amplifying a region of DNA from each individual that 
contains two or more microsatellite DNA repeat sequences; c) 
ionizing/volatilizing the amplified DNA; d) detecting the presence of the amplified 
DNA and comparing the molecular weight of the amplified DNA. Different sizes 
are indicative of non-identity (i.e., wild-type versus mutation), non-heredity or 
non-compatibility; similar size fragments indicate the possibility of identity, 
familial relationship, or HLA compatibility. More than one marker may be 
examined simultaneously; primers with different linker moieties are used for 
immobilization. 
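
The comparison in step d) can be sketched as a tolerance check on the measured fragment masses of two individuals. The function name `compare_markers`, the 5 dalton tolerance and the example masses are hypothetical; a real comparison would depend on instrument resolution and on the markers examined.

```python
def compare_markers(masses_a, masses_b, tolerance_da: float = 5.0) -> str:
    """Label two sets of fragment masses as 'similar' or 'different'.

    'different' corresponds to the text's non-identity case; 'similar'
    only indicates the possibility of identity or relationship.
    """
    a, b = sorted(masses_a), sorted(masses_b)
    if len(a) != len(b):
        return "different"
    matched = all(abs(x - y) <= tolerance_da for x, y in zip(a, b))
    return "similar" if matched else "different"
```

Two samples whose fragments pair up within the assumed tolerance are reported as "similar"; a shifted or missing fragment yields "different".
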

As noted, solid supports include, but are not limited to, flat surfaces, 
microtiter plates, beads, wafers, chips, and silicon supports. Compositions and 
methods for immobilizing nucleic acids to solid supports and for 
high density immobilization of nucleic acids are described in U.S. Patent 
Application Nos. 08/746,055 and 08/947,801 and published International PCT 
application No. WO 98/20166. Linkers for immobilizing nucleic acids to solid 
supports are well known. Linkers may be reversible or irreversible. A target 
detection site can be directly linked to a solid support via a reversible or 
irreversible bond between an appropriate functionality (L') on the target 
nucleic acid molecule (T) and an appropriate functionality (L) on the capture 
molecule (FIGURE 1B). A reversible linkage can be such that it is cleaved under the 
conditions of mass spectrometry (i.e., a photocleavable bond such as a charge 
transfer complex or a labile bond being formed between relatively stable organic 
radicals). 

Photocleavable linkers are linkers that are cleaved upon exposure to light 
(see, e.g., Goldmacher et al. (1992) Bioconj. Chem. 3:104-107), thereby 
releasing the targeted agent upon exposure to light. Photocleavable linkers that 
are cleaved upon exposure to light are known (see, e.g., Hazum et al. (1981) in 
Pept., Proc. Eur. Pept. Symp., 16th, Brunfeldt, K. (Ed.), pp. 105-110, which 
describes the use of a nitrobenzyl group as a photocleavable protective group for 
cysteine; Yen et al. (1989) Makromol. Chem. 190:69-82, which describes water 
soluble photocleavable copolymers, including hydroxypropylmethacrylamide 
copolymer, glycine copolymer, fluorescein copolymer and methylrhodamine 
copolymer; Goldmacher et al. (1992) Bioconj. Chem. 3:104-107, which 
describes a cross-linker and reagent that undergoes photolytic degradation upon 
exposure to near UV light (350 nm); and Senter et al. (1985) Photochem. 
Photobiol. 42:231-237, which describes nitrobenzyloxycarbonyl chloride cross 
linking reagents that produce photocleavable linkages), thereby releasing the 
targeted agent upon exposure to light. In preferred embodiments, the nucleic 
acid is immobilized using the photocleavable linker moiety that is cleaved during 
mass spectrometry. Exemplary photocleavable linkers are set forth in published 
International PCT application No. WO 98/20019. Bead linkers for immobilizing 
nucleic acids to solid supports are described in allowed U.S. application Serial 
No. 08/746,036, now U.S. Patent No. 5,900,481, and published International 
PCT application Nos. WO 98/20166 and WO 98/20020. 

Preferred applications include, but are not limited to, sequencing and 
diagnostics based on analysis of nucleic acids and polypeptides or diagnostics by 
mass spectrometry. Preferred mass spectrometry methods include ionization (I) 
techniques including, but not limited to, matrix assisted laser desorption 
(MALDI), continuous or pulsed electrospray (ESI) and related methods (e.g., 
Ionspray or Thermospray), or massive cluster impact (MCI); the ion sources can 
be matched with detection formats including linear or non-linear reflectron time- 
of-flight (TOF), single or multiple quadrupole, single or multiple magnetic sector, 
Fourier Transform ion cyclotron resonance (FTICR), ion trap, and combinations 
thereof (e.g., ion-trap/time-of-flight). For ionization, numerous 
matrix/wavelength combinations (MALDI) or solvent combinations (ESI) can be 
employed. DNA sequencing by mass spectrometry is described in U.S. Patent 
No. 5,547,835; U.S. Patent No. 5,691,141; and related U.S. application Serial 
Nos. 08/467,208, 08/481,033 and 08/617,010 and in PCT Patent Application 
Nos. Atty. Docket No. 24736-2007PC, filed December 15, 1998, and published 
International PCT application Nos. WO 94/16101 and WO 97/37041. 
DNA sequencing using mass spectrometry is described in U.S. Patent No. 
5,547,835. DNA sequencing by mass spectrometry via exonuclease 
degradation is described in allowed U.S. application Serial No. 08/744,590, U.S. 
Patent No. 5,622,824, published International PCT application No. 
PCT/US94/02938, U.S. Patent No. 5,851,765, and U.S. Patent No. 5,872,003. 
Processes for direct sequencing during template amplification are described in 
allowed U.S. Patent Application No. 08/647,368 and published International PCT 
application No. WO 97/42348. 

DNA diagnostics based on mass spectrometry are described in U.S. 
Patent No. 5,605,798 and published International PCT application Nos. WO 
96/29431 and WO 98/20019. Diagnostics based on mass spectrometry 
detection of translated target polypeptides are described in U.S. Application No. 
08/922,201 and published International PCT application No. WO 99/12040. 
Mass spectrometric detection of polypeptides is described in U.S. Patent 
Application No. 08/922,201 and U.S. application Serial No. 09/146,054. 

It is understood that the nature of the sample to be analyzed and the 
analysis to be performed, as well as the feasibility of automating a reaction 
process, determine the methods used in the system, and the methods are not to 
be limited to the particular embodiments described herein. Any method and 
process that requires small volumes and involves one or more steps in the 
exemplified embodiment may be adapted and used in a system as described 
herein. 

Exemplary Embodiment 

One preferred embodiment, which is a dual space system, integrates 
nucleic acid amplification (via PCR), immobilization of the nucleic acid on a solid 
support, followed by enzymatic reaction (e.g., PROBE, mass array, sequencing, 
nested PCR), sample conditioning, addition of an organic acid matrix for MALDI- 
TOF analysis and MALDI-TOF analysis on a microchip. This embodiment is 
described with respect to the Automated Process Line (APL) system 100 
depicted in FIG. 1. As noted above, samples are initially prepared in a 
contamination-controlled environment 102, such as a clean room or laminar flow 
room, and are moved by a sterile transport chamber 104 or taxicab into a non- 
sterile environment 106. In FIG. 1, samples are indicated by rectangular 
elements with criss-crossed lines. 

In the FIG. 1 embodiment, sample preparation begins in a Liquid Handling 
System 108, such as the Tecan "Genesis 200/8 Robotic Sample Processor" 
product. One or more samples 110 of purified genomic DNA are delivered by a 
robot 112 to 96-well or 384-well microtiter plates 114 in the Liquid Handling 
System 108, preferably using a 200 cm instrument width and an 8-tip arm. 
These sample processing steps occur in the contamination-controlled 
environment 102. Multiple samples may be included in the APL system for high- 
throughput processing. These samples may, at times during processing, be held 
in a sample storage apparatus, such as the "Plate Cube" rack 116 available from 
Robocon. To the sample plates 114 is added a PCR reaction mix 118 
including PCR primers, where one of the primers is labeled at the 5' end with a 
functionality, such as biotin, that can be used to immobilize the amplicon to a 
solid support. Where multiple samples are to be 
processed, a wash solution is contained in a reservoir 120 and is used to clean 
the pipette tips to prevent cross-contamination between samples or reagents. 
Alternatively, the APL system can process multiple samples using disposable 
pipette tips. 

The sample plates are manipulated by a robotic system, for example the 
Robocon robot 112, such as the CRS A 255 Robot, which moves along a central 
track 122. The robot 112 operates under control of a clean room control system 
computer 124 that includes a central processing unit (CPU) 126, an operator 
interface 128, and an APL interface 130. The CPU can comprise any 
commercially available desktop computer, such as an IBM-compatible personal 
computer (PC) or the like. The operator interface 128 includes a visual display 
and keyboard or other device through which an operator provides commands. 
The APL interface 130 is an interface between the computer and the process 
line, through which the computer 124 controls the robot. The APL interface 
may include, for example, a robot control program installed in the computer 124 
and available from Robocon for control of its robot products. An optional second 
computer 131 can assist the first computer 124 in performing clean room 
processing. 

The robotic arm is equipped with a gripper 132, such as the "Digital 
Servo Gripper" arm, also available from Robocon, to pick up and drop off the 
sample plates 114 as needed for processing. In a particular embodiment, a 
microtiter plate is aligned with the gripper so the plate receives pins 134 of the 
gripper, which couple the plate with the gripper for more secure 
transport. 

FIG. 1 shows a sample plate 140, including the sample and PCR mix, that 
is moved to a turntable 142 and oriented such that the robot picks it up and 
moves it to a bar code reader 144, for example, as is available from Datalogic, 
where the bar code is read and recorded for sample tracking. Sample tracking 
and reorientation may be performed multiple times during sample processing to 
assist the robot in sample handling. 

The sample plate 140 is reoriented by the robotic arm, using the 
turntable, and is then placed in a lid parking station 146, such as is available as 
a robotic module in the Robocon robotic system. At the lid parking station, a lid 
may be parked or retrieved. In the preferred embodiment, the lid is a solid 
structure, such as a metal lid, with a flexible seal such that placing the lid on the 
plate seals the contents of the plate. The sealing eliminates evaporation during 
subsequent processing, such as PCR amplification. Such a sealing apparatus, 
known as "MJ Microseal", is available from MJ Research, Inc. Alternatively, 
after the sample plate is reoriented, it can be penetrably sealed. For example, 
the sample plate can be covered with a foil wrap that can later be penetrated by 
test probes or the like. A similar penetrable seal can be provided by a parafilm 
that is attached to the plate by heat, or other plastic or wax based sealers. 

The sealed sample plate is then picked up by the robotic gripper arm and 
transported from the laminar flow environment 102 into the taxicab transport 
station 104, which provides a sterile environment. First, an entry door opens in 
the taxicab to permit the robot to place the sample plate into the taxicab. Once 
in the taxicab 104, the entry door closes behind the sample to prevent 
contamination. Within the taxicab transport station 104, the sample plate is 
placed onto and is transported along a pneumatically driven stage, and a second 
door opens to permit the sample to exit the taxicab into a non-sterile 
environment. Once outside the sterile taxicab environment, control of sample 
manipulation is transferred to a second robot 150, also equipped with a gripper 
152 and moving along a center track 153. The sample plate is transported by 
the robot 150 and is read by a second bar code reader 154 for sample tracking. 
The second bar code reader 154, as well as a second turntable 156, lid park 
station 158 and sample storage rack 160, are included outside the 
contamination-controlled area 102 for more efficient sample handling. 

The robot 150 operates under control of a PCR Room computer 161 that 
has a construction similar to the Clean Room computer 124. Thus, the PCR 
Room computer 161 can comprise any commercially available desktop computer 
that can interface with the APL system process line and stations. 

After the sample identification code has been read by the bar code reader 
154, the sample plate is moved by the system robot 150 to a PCR station 162 
where amplification is carried out. The amplification reaction can be PCR, ligase 
chain reaction, etc. In a preferred embodiment, the "MJR Tetrad" thermocycler 
available from MJ Research, Inc., is used for PCR amplification. Other PCR 
thermocycler systems are commonly known to those of skill in the art and may 
optionally be integrated into the system. Methods for DNA amplification are well 
known to those of skill in the art. Multiplex PCR can also be carried out using 
the system. 

After PCR amplification, the plates are removed from the PCR reaction 
station 162 by the robot 150. The plates are then moved to the lid park station 
158, where the lids are removed and the plates unsealed. As noted above, however, a 
penetrable seal such as a foil wrap or parafilm is an alternative to a lid seal, and 
if removable lids are not used to seal the plates, then the lid park station is 
unnecessary and the next substance that must be added to the wells of the plate 
will be inserted upon piercing of the foil wrap. 

Alternatively, using a second liquid handling system 164, preferably a 
Tecan "Genesis 200/8" system, streptavidin-coated paramagnetic beads can be 
loaded from a reservoir 166 and mixed with the PCR-amplified DNA in the 
sample plate, resulting in immobilization of the amplicon via the functionalized 
(e.g., biotinylated) primer. Beads are used, for example, where the samples are 
contained in multiwell microtiter plates. The beads and PCR products are 
reacted by shaking, using a shaking apparatus 168, such as is available from 
Robocon, and which is integrated into the APL system. 

The sample plates are then moved to a liquid handling and mixing station 
170, into which a magnetic lift station 172 has been incorporated, for post-PCR 
processing. In a preferred embodiment, the liquid handling station is a 
"Multimek 96" well pipetting station, available from Beckman. The magnetic lift 
applies magnets to the sample plate by moving the magnets up against the 
bottom of the sample plate, for example, by using a pneumatic lift, thereby 
immobilizing the DNA and beads, and the supernatant is removed. The magnets 
are then released and liquid is added to the wells to resuspend the sample. 
Alternatively, the sample plate could be moved, for example, by the robot to 
bring it into contact with the magnet. The magnet can be a solid surface that 
interacts with the entire bottom of the sample plate, or can be designed to more 
specifically interact with the individual samples. For example, where the sample 
plate is a 96-well microtiter plate, the magnet can be configured as 8 or 12 
individual strips so that each strip comes into contact with the bottom of a 
single row of wells. 
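
The magnet-assisted liquid handling rounds described above follow a fixed order that controlling software could represent as a list of steps. The sketch below is illustrative only; the function name `bead_wash_steps` and the step wording are assumptions, and no instrument commands are implied.

```python
def bead_wash_steps(rounds: int, reagent: str = "wash buffer") -> list:
    """List the steps of each magnet-assisted liquid handling round in order."""
    steps = []
    for i in range(1, rounds + 1):
        steps += [
            f"round {i}: raise magnets against plate bottom",
            f"round {i}: remove supernatant",
            f"round {i}: release magnets",
            f"round {i}: add {reagent} and resuspend beads",
        ]
    return steps
```

Each round contributes four steps, so three rounds yield twelve entries, beginning with the magnets being raised.
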

Conventionally, the magnets of the magnet lift station 172 are elongated 
strip magnets arranged in rows between sample wells. Alternatively, the 
magnets can be configured as individual point magnets, for example, as disk- 
shaped magnets arranged into an 8 x 12 grid of magnets that correspond to the 
positions of the sample wells in a 96-well microtiter plate. This configuration 
provides an advantage over the magnetic strip configuration, particularly where 
small volumes are to be added to the sample. For example, as illustrated in 
FIG. 2, where magnetic strips 202 are used with a multiwell microtiter plate 
204, the magnet strips are offset from the center of the sample wells 206 and 
magnetic beads 208 concentrate along the sides of the wells. 

It is desirable that all beads be concentrated in a location such that added 
liquid makes maximum contact with the samples. If, for example, a volume of 
sample is removed from the wells and a smaller volume is to be subsequently 
added, the smaller volume might not be sufficient to wash all the beads from the 
side of the wells, and the sample concentration could be affected. FIG. 3 is a 
plan view of the alternative, preferred embodiment, and shows a portion of the 
construction that centers a disk-shaped point magnet 302 beneath the center of 
each sample well in a multiwell microtiter plate. For simplicity of illustration, 
only a 4X5 grid is shown. It should be apparent that by using individual point 
magnets at the bottom of the wells, the beads collect at the bottom of the wells 
and are more easily resuspended, particularly where a smaller volume of liquid is 
to be added. Multiple rounds of liquid handling are employed to allow for 
supernatant removal, denaturation of double stranded DNA, wash steps and the 
addition of enzymatic reaction reagents (PROBE). 
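
The point-magnet layout described above can be sketched as a grid of magnet centers, one beneath each well. This is an illustration only: the 9 mm well-to-well pitch is the usual spacing of a 96-well microtiter plate and is an assumption here, since the text gives no dimensions, and the function name is hypothetical.

```python
def magnet_centers(rows=8, cols=12, pitch_mm=9.0):
    """Return {(row, col): (x_mm, y_mm)} with one point magnet under each well.

    Coordinates are measured from the center of the first well (row 0, col 0).
    pitch_mm is an assumed well-to-well spacing, not a value from the text.
    """
    return {
        (r, c): (c * pitch_mm, r * pitch_mm)
        for r in range(rows)
        for c in range(cols)
    }


centers = magnet_centers()
print(len(centers))       # one magnet per well of the 8 x 12 grid
print(centers[(7, 11)])   # magnet under the last well of the plate
```

Because each magnet sits on the well axis, beads are drawn to the well bottom rather than to a side wall, which is the advantage the text attributes to this configuration.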

Returning to FIG. 1, a sample plate 176 is next moved by the robotic 
system to the lid park station 158, and sealed with a lid. This operation is 
optional and is used, for example, when the sample is subjected to high 
temperatures in order to prevent evaporation. The sample plate can otherwise 
remain open to the environment. 

The robot 150 moves the sample plate again to the PCR station 162 and 
places it into a thermocycler of the PCR station. The thermocycler carries out an 
enzymatic reaction. The enzymatic reaction can be, for example, PROBE, nested 
PCR, primer extension, or sequencing reactions (e.g. Sanger). Details for such 
enzymatic reactions are commonly known to those skilled in the art. 

After the reaction is complete, the sample plate is removed from the 
thermocycler of the PCR station 162 and then is returned to the lid park station 
158 by the robot 150, and the lids are removed and the plate unsealed. 

The sample plates are again moved to the liquid handling and mixing 
station 170 containing the magnetic lift station 172, which applies the magnets, 
immobilizing the beads and DNA. The liquid handling and mixing station then 
removes the supernatant. The magnets are then released and liquid is added to 
the wells. Multiple rounds of liquid handling are employed to allow for washing 
steps or treatment with ammonium citrate, TRIS, or any other reagent that 
removes salt ions and replaces them with ammonium ions, thereby conditioning 
the samples prior to mass spectrometry. Once conditioned, the primer extension 
product is denatured from the immobilized DNA with ammonium hydroxide and 
released into the supernatant. The ammonium hydroxide reaction is performed 
for five minutes at approximately 60° F. The supernatant is removed to a clean 
sample plate and placed on a shaker 168. 

The sample plate is next transported to a sample preparation station 178 
to prepare it for analysis. In a preferred embodiment, where MALDI-TOF mass 
spectral analysis is performed, nanoliter or smaller volumes of sample are 
dispensed onto pre-made silicon chips to form a microarray and reacted with 
matrix. In general, however, the sample may involve any preparation for use 
with any analytical method. Nanoliter or smaller volumes are dispensed using a 
nanoliter or lower dispensing apparatus, such as a piezoelectric pipette, including 
the "Nano-Plotter" station, available from GeSiM. Finally, the sample plate is 
transported to the analytical system, e.g., a mass spectrometer or an instrument 
for another spectrometric technique, such as UV/VIS, IR, fluorescence, 
chemiluminescence or NMR spectrometry, where sample analysis is performed. 

Several alternatives are possible for preparing a sample for analysis and 
loading the sample into the analytical system. For example, three separate 
components, including a dispensing apparatus, a sample platform containing test 
samples, and an analytical instrument, can be integrated into the APL system. 

In a preferred embodiment, a nanoliter dispensing apparatus ("Nano-Plotter") 
180 of the sample preparation station 178 is used to prepare one or 
more samples for mass spectral (MS) analysis, preferably using MALDI-TOF MS. 
In preparing a sample for MALDI-TOF analysis, the sample is co-crystallized with 
a matrix material. The sample is then loaded into a mass spectrometer 182 on a 
MS sample platform. Alternatively, the MS platform may be integrated into the 
mass spectrometer, rather than being a separately controlled component. The sample 
platform can be adapted to hold one or more sample analysis vessels, such as 
microchips. 

In another embodiment, the APL system can carry out enzymology 
directly on the beads and can directly add matrix to the beads to analyze using 
mass spectrometry, where the DNA is ionized directly off the beads. This 
eliminates the need for a nanoliter dispensing station 178 such as the GeSiM 
"Nano-Plotter"; rather, matrix is added with the liquid handling system 170. 

In a preferred embodiment, one or more microchips containing test 
samples are prepared by dispensing nanoliter volumes of a sample and an 
organic acid matrix onto a chip using a nanoliter dispensing apparatus 180, or a 
nanoliter dispensing apparatus, and loading the chips into a mass spectrometer 
182. Alternative embodiments are possible where (1) one or more test samples, 
e.g., on sample chips, are prepared on a sample platform on the nanoliter 
dispensing apparatus, such as the Nano-Plotter, and the sample platform is then 
transferred, e.g., by a robot, into the mass spectrometer; or (2) where one or 
more sample chips are prepared on the nanoliter dispensing apparatus, 
transferred to a mass spectrometer sample platform station 184 and then 
inserted into the mass spectrometer. 

In another embodiment, the APL system can carry out enzymology 
directly on a microchip by performing the steps of: 

1. Aliquot genomic DNA and transfer to second chamber via taxi; 

2. PCR amplify the genomic DNA using previously described steps; 

3. Using a liquid handling apparatus (Tecan or GeSiM) or pintool, add 
DNA to microchip. The chips are held in a holder that can be 
manipulated by the robot; 

4. Add PCR reaction mix to chip; 

5. Incubate on thermocycler; 

6. Wash chip with liquid handling apparatus; 

7. Add matrix to chip; 

8. Load chip in MALDI; and 

9. Ionization/Desorption directly from the chip via MALDI. 

Mass Spectrometer Interface 

The nanoliter dispensing apparatus and mass spectrometer are integrated 
into the system 100 and communicate with each other, either directly or via a 
control computer. For example, in one embodiment, commands are 
automatically executed from a computer controller to initiate opening and closing 
of a mass spectrometer entry door (e.g., by using pneumatics or a motor-driven 
mechanism) and to initiate loading of a MS sample platform into the 
spectrometer (e.g., by using a robotic arm), where the platform is either loaded 
with sample chips directly on a nanoliter dispensing apparatus, such as the 
Nano-Plotter 180, or the sample chips are prepared on a Nano-Plotter 180 and 
then are transferred onto a sample platform 184. FIG. 4 shows one 
implementation of the robotic interface between the nanoliter dispensing 
apparatus and the mass spectrometer illustrated in FIG. 1. 

In the FIG. 4 embodiment, the samples are automatically transported from 
the sample preparation station 178 to the mass spectrometer 182 by a robotic 
arm system 410 (not shown in FIG. 1). As described above, the samples are 
prepared for the mass spectrometer 182 in the nanoliter dispensing apparatus 
180 and/or the sample platform station 184. When preparation is complete, an 
arm 412 rotates about a pivot base 414 to pick up the samples from the sample 
preparation station and then positions them at a sample entry station 416 of the 
mass spectrometer. 
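
The command sequence issued by the control computer can be sketched as follows. This is a minimal illustration, not the controller of the preferred embodiment: the class, the command names, and the exact ordering of door and arm commands are assumptions, and the devices only record what they are told to do.

```python
class RecordingDevice:
    """Stand-in for a real actuator; it only records the commands it receives."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def command(self, action):
        self.log.append(f"{self.name}:{action}")


def load_platform(door, arm):
    """Issue one loading cycle: open the door, move the samples, close the door."""
    door.command("open")               # entry door, pneumatic or motor-driven
    arm.command("pick_up_from_prep")   # arm 412 picks up at preparation station 178
    arm.command("place_at_entry")      # arm positions samples at entry station 416
    door.command("close")


log = []
load_platform(RecordingDevice("door", log), RecordingDevice("arm", log))
print(log)
```

Routing every command through one function keeps the door and arm movements in a fixed order, whether the two instruments talk directly or through the control computer.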
Data Analysis 

Conventionally, the output of mass spectrometer testing is analyzed by 
an individual datum-by-datum, so that an individual examines the output of a 
sample test and makes a conclusion about the test, sample-by-sample. In the 
Automated Process Line (APL) described above, the volume of test results is 
sufficiently large that any individual analyzing the mass spectrometer output 
would quickly be unable to keep up with the APL output pace. The APL system 
of the preferred embodiment performs computer-automated analysis of mass 
spectrometer output data to determine genotype or make another analysis as 
quickly as the system produces test results. The data analysis can continue as 
long as the system is in operation, including on a round-the-clock, 24-hour basis. 
The APL system performs the test output analysis by automatically processing 
the mass spectrum output data of a sample, comparing the output data against 
expected spectrum output values for different genotypes, determining the most 
likely genotype for the sample, and continuing with the output data of the next 
sample. 

In the preferred embodiment illustrated in FIG. 1, the data analysis is 
performed by a dedicated data analysis computer 188 that receives output data 
from the mass spectrometer 182 and any other pertinent APL stations or 
components. The data analysis computer can comprise any commercially 
available desktop computer, and can have the same configuration and 
components as the Clean Room control computer 124 described above. Thus, 
the data analysis computer 188 includes a CPU having an operating environment 
in which programs are executed, and also includes an operator interface with a 
keyboard and a display. 

The process line 100 operates continuously until a stop command is 
received, for a high sample throughput. The process line therefore provides for 
emergency situations in which an immediate halt is required by means of halt 
switches 198 placed around the line. The system also can be halted by a 
software halt command that is input by an operator at any of the control 
computers 124, 131, 161, 188. The sample preparation, testing, and data 
analysis otherwise continue unimpeded. 

A visual display of the data analysis is depicted in FIG. 5, which shows 
from top to bottom: a graph of two exemplary test spectra against which output 
data will be compared; a graph of output data picked peaks for analysis; and a 
graph of smoothed spectrum data. Those skilled in the art will appreciate that 
the spectra shown in FIG. 5 correspond to multiple graphs of mass spectrometer 
output, wherein the horizontal axis (x-axis) units are in mass per unit charge, 
also referred to as units of Daltons, and the vertical axis (y-axis) is in relative 
intensity of spectrometer discharge. 

The exemplary spectra shown in FIG. 5 relate to male-female genotypes, 
but those skilled in the art will appreciate that any other paired-outcome typing 
decisions may be the subject of the sample analysis. 

In FIG. 5, the first test spectrum is labeled "Test-Female" and corresponds 
to an output spectrum that might be expected from a female test subject. The 
second test spectrum is labeled "Test-Male" and corresponds to an output 
spectrum that might be expected from a male test subject. Thus, the object of the APL 
processing will be to determine whether a given sample genotype belongs to a 
female subject or a male subject. The "Picked Peaks" graph of FIG. 5 is a 
display of the mass spectrometer output for a particular sample over a 
predetermined range, to show particular output peaks. The output peaks shown 
in the Picked Peaks graph are selected by the APL system based on input 
parameters supplied by the APL operator, as described further below. The 
bottom spectrum of FIG. 5 is a display of the spectrum output after correction 
processing initiated by the APL system. It should be understood that the 
Test-Female and Test-Male graphs of the FIG. 5 display will not change as the APL 
system processes the mass spectrometer output data, while the Picked Peaks 
and Smoothed Spectrum graphs are different for each sample, and therefore 
will generally change with each sample being processed. It also should be 
understood that the Picked Peaks and Smoothed Spectrum displays can be 
stopped on any one of the output graphs, if the operator wants to view one 
particular set of graphs. FIG. 6 is a flow diagram of the operating steps 
performed by the APL system in carrying out the mass spectrometer data 
analysis, and will be best understood with reference to the FIG. 5 graphs. 

The first data analysis step, represented in FIG. 6 by the flow diagram 
box numbered 602, is to receive test run input parameters. These are 
parameters that the APL system will receive from an operator and will apply in 
processing a run of mass spectrometer output data. That is, the APL system 
will use the test run input parameters to evaluate test samples until the test run 
parameters are changed by the APL operator. As noted above, a test run might 
involve producing mass spectrometer output and analyzing it on a 24-hours-per- 
day basis. In the preferred embodiment, the operator provides the test run 
parameters through a graphical user interface using a display mouse and 
keyboard of the APL system. The test run input parameters received from the 
operator will include the x-axis range in Daltons for the spectrometer output data 
and x-axis locations of expected peaks that are picked for data identification and 
genotype evaluation. The input parameters will also include an expected 
baseline value, defining a noise floor above which data should comprise a peak. 
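
The test run input parameters can be pictured as a small, validated record. The following is a sketch under stated assumptions: the class and field names are hypothetical, the peak width field reflects the width value that the input parameters are later said to specify, and the numeric values are illustrative rather than taken from any actual run.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RunParameters:
    mass_range: Tuple[float, float]     # x-axis range in Daltons
    expected_peaks: Tuple[float, ...]   # x-axis locations of expected peaks
    peak_width: float                   # assumed gaussian peak width, in Daltons
    baseline: float                     # expected noise floor (relative intensity)

    def __post_init__(self):
        low, high = self.mass_range
        if low >= high:
            raise ValueError("mass_range must be (low, high) with low < high")
        for mz in self.expected_peaks:
            if not low <= mz <= high:
                raise ValueError(f"expected peak {mz} lies outside the mass range")


# Illustrative values only.
params = RunParameters(
    mass_range=(4000.0, 9000.0),
    expected_peaks=(5000.0, 6500.0, 8000.0),
    peak_width=50.0,
    baseline=0.05,
)
print(params)
```

Making the record immutable mirrors the text: the same parameters govern every sample until the operator supplies a new set.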

In the next processing step, represented by the FIG. 6 flow diagram box 
numbered 604, test data is received for a particular test sample submitted to the 
mass spectrometer of the APL system. A particular test sample may be one well 
in a 96-well-by-96-well tray, for example. Other tray sizes may be 
accommodated by the APL. 

Those skilled in the art will understand that a mass spectrometer 
bombards a crystalline-based sample with energy until the sample vaporizes and 
output products are produced. The output products include intact sample 
particles that are ionized and projected outwardly to different distances from the 
sample center. The mass spectrometer detects the distribution of output 
products having a particular mass per unit charge and assigns a relative intensity 
to those output products. The mass/charge units are given in Daltons or 
kiloDaltons (kD). Thus, the mass spectrometer output for a given sample is a 
sequence of paired numbers, or x-y values, that specify the detected 
mass/charge over a range of Daltons (x-axis) and the corresponding relative 
intensity (y-axis) distribution over that range. 

For each set of sample data that is processed, the APL system removes 
the residual baseline. This processing is represented by the FIG. 6 flow diagram 
box numbered 606, and allows for a rolling baseline that might otherwise skew 
the output data. More particularly, with current processing systems, it is 
possible to misinterpret peaks or spikes, such as where true data peaks are 
located in valleys. Conventional programs identify peaks by detecting data 
intensity values (see FIG. 5) that are greater than a baseline value. The data, 
however, can contain localized areas in which a peak lies within a valley of a 
plateau area having an elevated baseline. Peaks that are in such valleys may be 
missed by conventional programs that do not detect a sufficient difference 
between the peak height relative to the plateau level. It has been found that 
such conventional programs may correctly identify peaks up to 80% of the time, 
but cannot generally provide greater accuracy due to missed peaks. 

the output, for primer output artifacts. Therefore, a total of three peaks will be 
expected in the mass spectrometer output over the range of interest. Then the 
peak-free regions would be those regions in the output data along the x-axis over 
the range of interest, with the data at the three identified peaks deleted. As 
noted above, the peaks are assumed to be gaussian, with a width value 
specified in the input parameters. Therefore, the data for deletion comprises the 
peaks identified in the test run input parameters and also an area two peak 
widths wide on either side of each identified peak (peak midline, +/- two peak 
widths). 
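
The deletion rule above (peak midline, +/- two peak widths) can be sketched as a simple filter over the x-y pairs of the spectrum. The function name and the toy spectrum are assumptions made for illustration.

```python
def peak_free_points(spectrum, peak_positions, peak_width):
    """Keep only the (mass, intensity) pairs that lie outside every peak window.

    A window spans the peak midline plus or minus two peak widths.
    """
    def in_window(mz):
        return any(abs(mz - p) <= 2 * peak_width for p in peak_positions)

    return [(mz, y) for mz, y in spectrum if not in_window(mz)]


# Toy spectrum: one point every 10 Da from 4000 to 4200 Da, flat intensity.
spectrum = [(4000.0 + 10 * i, 1.0) for i in range(21)]
kept = peak_free_points(spectrum, peak_positions=[4100.0], peak_width=15.0)
print([mz for mz, _ in kept])
```

With one expected peak at 4100 Da and a width of 15 Da, every point from 4070 Da to 4130 Da is deleted and the remaining points form the peak-free region.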

It is the mass spectrometer output data with the peaks deleted that gives 
the peak-free region, to which the quadratic equation is fitted. Typically, the 
variable quadratic coefficients would be small, but it is possible to get 
contamination from the lower-mass sample particles, which can skew the 
output. If such contamination is present in the output, then the sample output 
may be skewed so that the peak-free regions will be best modeled by a quadratic 
equation. It has been found that contamination products are best modeled with 
a quadratic equation, rather than a linear, cubic, or other type of equation. 

The technique for determining the coefficients of the quadratic equation 
for the best fit to a peak-free baseline is preferably a least squares fit technique, 
which will be well-known to those skilled in the art. In particular, error 
minimization using gradient information has been found suitable for the least 
squares fit. Thus, the curve-fit quadratic baseline equation can be used to 
produce an expected baseline over the mass spectrometer output range of 
interest. Therefore, as part of the baseline correction processing represented by 
the FIG. 6 flow diagram box numbered 606, at each data point interval along the 
range of interest (e.g., from 4000 to 9000 Daltons), the curve-fit baseline 
equation is used to calculate a corrected baseline value, which is subtracted 
from the sample data. The baseline correction occurs over the entire data range, 
including at the peaks. This produces a new set of baseline-corrected sample 
data values, i.e., a baseline-corrected output spectrum. 
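
A quadratic least squares baseline fit and subtraction can be sketched as below. Note the substitution: the text mentions error minimization using gradient information, whereas this sketch solves the same least squares problem in closed form through the normal equations. The function names and the synthetic data are assumptions.

```python
def fit_quadratic(points):
    """Least-squares quadratic through (x, y) points; returns a callable baseline."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x0 = sum(xs) / len(xs)                    # center x for numerical stability
    scale = (max(xs) - min(xs)) / 2 or 1.0    # and scale it to roughly [-1, 1]
    ts = [(x - x0) / scale for x in xs]
    power_sums = [sum(t ** k for t in ts) for k in range(5)]
    # Augmented normal equations for y = a + b*t + c*t**2.
    rows = [
        [power_sums[i + j] for j in range(3)]
        + [sum(y * t ** i for t, y in zip(ts, ys))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [u - factor * v for u, v in zip(rows[r], rows[col])]
    a, b, c = (rows[i][3] / rows[i][i] for i in range(3))

    def baseline(x):
        t = (x - x0) / scale
        return a + b * t + c * t * t

    return baseline


def true_baseline(x):
    d = x - 4000.0
    return 2.0 + 1e-4 * d + 1e-8 * d * d


# Synthetic spectrum: a quadratic baseline plus one spike of height 5 at 6000 Da.
spectrum = [(float(x), true_baseline(x) + (5.0 if x == 6000 else 0.0))
            for x in range(4000, 9001, 50)]
peak_free = [(x, y) for x, y in spectrum if abs(x - 6000.0) > 200.0]
baseline = fit_quadratic(peak_free)
# The correction is applied over the entire range, including at the peak.
corrected = [(x, y - baseline(x)) for x, y in spectrum]
print(round(dict(corrected)[6000.0], 6))   # height of the spike after correction
```

The fit uses only the peak-free points, but the subtraction runs over every data point, as the text requires.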

In the next processing step, represented by the FIG. 6 flow diagram box 
numbered 608, a curve is fit to each baseline-corrected peak value in the mass 
spectrometer output data. In the preferred embodiment, a standard curve fitting 
algorithm is used, such as the Marquardt-Levenberg algorithm. This fits a 
gaussian curve to each possible baseline-corrected output peak position. Those 
skilled in the art will understand that the output of such curve fitting will provide 
coefficients of a gaussian distribution centered at each peak that will match the 
height of the baseline-corrected output data at that peak, and will also provide 
the covariance of the curve-fit height. Thus, the box 608 curve fitting will 
provide, for each peak, equation coefficients that give a peak height and a 
covariance for the equation at that peak. 
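
The height fit can be illustrated with a simplified stand-in for the Marquardt-Levenberg algorithm. If the gaussian center and width are held fixed at the values from the input parameters (an assumption of this sketch, as is treating the peak width as the gaussian standard deviation), the model is linear in the height and least squares has a closed form. An iterative method such as Marquardt-Levenberg would be needed if the center or width were also fitted.

```python
import math


def fit_peak_height(spectrum, center, width):
    """Least-squares height of a gaussian with fixed center and width.

    Returns (height, variance_of_height), using points within two widths
    of the center. Needs at least two points in that window.
    """
    pts = [(x, y) for x, y in spectrum if abs(x - center) <= 2 * width]
    if len(pts) < 2:
        raise ValueError("need at least two data points around the peak")
    g = [math.exp(-((x - center) ** 2) / (2 * width ** 2)) for x, _ in pts]
    sum_gg = sum(gi * gi for gi in g)
    height = sum(gi * y for gi, (_, y) in zip(g, pts)) / sum_gg
    residual = sum((y - height * gi) ** 2 for gi, (_, y) in zip(g, pts))
    variance = residual / (len(pts) - 1) / sum_gg
    return height, variance


# Synthetic, noise-free peak of height 3 centered at 5000 Da.
true_height, center, width = 3.0, 5000.0, 10.0
spectrum = [(x, true_height * math.exp(-((x - center) ** 2) / (2 * width ** 2)))
            for x in (4950.0 + 2 * i for i in range(51))]
height, variance = fit_peak_height(spectrum, center, width)
print(height, variance)
```

The returned variance plays the role of the covariance of the curve-fit height: it is small when the data follow the gaussian shape closely and grows with the scatter around it.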

In a preferred embodiment, the "Picked Peaks" graph in FIG. 5 represents 
all peaks in the mass spectrometer output that have a height that exceeds the 
baseline corrected value generated by the box 606 processing, using peaks that 
are modeled from the box 608 processing. Alternatively, the Picked Peaks graph 
may represent the peaks in the actual mass spectrometer output that exceed an 
input threshold value. This latter type of Picked Peaks graph display is the type 
that is typically provided by mass spectrometer manufacturers, such as 
Bruker-Franzen Analytik GmbH ("Bruker") of Germany. In the preferred 
embodiment, the "Smoothed Spectrum" graph of FIG. 5 represents the output 
from the mass spectrometer with default data processing, which may include 
curve smoothing or other data processing provided by the mass spectrometer 
manufacturer. This type of Smoothed Spectrum graph is provided, for example, 
as standard output from the Bruker mass spectrometer. Alternatively, the 
Smoothed Spectrum graph may represent the mass spectrometer output with the 
baseline threshold parameter subtracted, or the actual mass spectrometer output 
with the quadratic-fit baseline curve subtracted. 

In the next processing step, represented by the FIG. 6 flow diagram box 
numbered 610, the APL system determines the probability that the output data 
at each identified peak location is a valid peak. In the preferred embodiment, the 
peak validation decision is made by comparing probability density functions 
(PDF) for the peak-free region and for the fitted peak by constructing gaussian 
(or normal) probability curves and comparing them to determine if the data 
overlaps. If the two curves (the fitted peak and the peak-free region) are 
substantially free of any overlap, then the APL assumes that a true peak has 
been identified. Otherwise, the fitted "peak" is considered a spurious datum in 
the noise of the mass spectrometer output. 

More particularly, the PDF of the peak-free region is assumed to be a 
gaussian distribution. The mean height and the standard deviation are 
determined by the mass spectrometer output for the sample in question. The 
PDF at each identified peak location is assumed to be a gaussian distribution 
with the mean height and the standard deviation given by the curve fitting 
algorithm described in box 608. The second gaussian curve will be determined 
once for each peak. The degree to which the two curves resemble each other is 
compared statistically using hypothesis testing that will be well-known to those 
skilled in the art. The output of the hypothesis test will be a probability value 
(from zero to one) that characterizes the peak under consideration. Thus, each 
peak is assumed to be an independent statistical event. 

For example, the comparison uses the baseline curve, which is a 
quadratic model (peak-free region) having a particular mean height and 
corresponding standard deviation. The comparison also uses the gaussian model 
of each peak, having a mean height and standard deviation. If the mean values 
of the two respective curves are different by more than two standard deviations, 
then it is assumed there is no overlap for purposes of peak validation. That is, 
the test peak is a valid peak. If the two curves are not different in mean by 
more than two standard deviations, then the identified peak is not a valid peak, 
but is part of the output noise. 
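
The two-standard-deviation comparison can be sketched as follows. The text does not say which standard deviation is used, so this sketch assumes the two are combined in quadrature and that only peaks rising above the noise count. The function names are hypothetical, and the one-sided normal test is one common form of hypothesis test, not necessarily the one used in the preferred embodiment.

```python
import math


def peak_probability(peak_mean, peak_sd, noise_mean, noise_sd):
    """Probability (0 to 1) that the fitted peak height exceeds the noise level."""
    combined_sd = math.hypot(peak_sd, noise_sd)
    if combined_sd == 0.0:
        return 1.0 if peak_mean > noise_mean else 0.0
    z = (peak_mean - noise_mean) / combined_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def is_valid_peak(peak_mean, peak_sd, noise_mean, noise_sd, n_sd=2.0):
    """True when the means differ by more than n_sd combined standard deviations."""
    combined_sd = math.hypot(peak_sd, noise_sd)
    return (peak_mean - noise_mean) > n_sd * combined_sd


print(is_valid_peak(5.0, 0.5, 0.2, 0.3))   # tall peak, well clear of the noise
print(is_valid_peak(0.5, 0.4, 0.2, 0.3))   # small bump inside the noise
print(round(peak_probability(0.5, 0.4, 0.2, 0.3), 4))
```

The boolean test gives the valid or not-valid decision described above, while the probability is the per-peak value that the genotype decision can later combine.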

After the APL system evaluates the probability for all of the peaks, it will 
know the number of peaks that have been identified as valid. The system then 
determines probabilities for the genotypes under consideration. The APL system 
makes a data typing decision based on the presence or absence of sufficient true 
or validated peaks to indicate one genotype or the other. This processing is 
indicated in FIG. 6 by the flow diagram box numbered 612, and is carried out in 
a probabilistic manner. 

For example, suppose a sample is to be typed as either female or male, 
and a female is indicated by the presence of an output peak at a position "A" 
and the absence of an output peak at a position "B", while a male is indicated by 
a peak at position "A" and also at position "B". Then the probability of a sample 
being female is the product of the probability of a true peak occurring at A and 
the probability of a peak not occurring at B. Stated in equation form: 

P(female) = P(A) * (1 - P(B)). 

The probability of a sample being male is then the product of the probability of a 
true peak occurring at position A and the probability of a true peak occurring at 
position B, given by the equation: 

P(male) = P(A) * P(B). 

This analysis is performed automatically by the APL system for each of the 
samples processed by the mass spectrometer. Based on these probabilities, the 
APL system decides whether the mass spectrometer output identifies a male or a 
female. If the probabilities indicate an ambiguous outcome, then the mass 
spectrometer output is considered inconclusive. In the preferred embodiment, a 
probability is considered conclusive if it is at least ten times the probability of the 
alternative outcome. Thus, if P(female) > 10 * P(male), then 
the typing decision is for a female. If P(male) > 10 * P(female), then the typing 
decision is for a male. 

After the analysis has been performed for a sample subject, the APL 
system checks for additional mass spectrometer output for analysis. As noted 
above, the APL system can support mass spectrometer output at the rate of 
hundreds of output sets per hour. As indicated by the decision box 614 in FIG. 
6, if more data is present, an affirmative outcome at box 614, then APL control 
resumes with receiving the next set of output data at the flow diagram box 
numbered 604. If there is no more mass spectrometer output data for analysis 
or if a system operator indicates a halt command, a negative outcome at box 
614, then the sample run ends and other operation of the APL continues. For 
example, operation may return to box 602, where more test run input 
parameters are received and output analysis is resumed. Other processing may 
occur, as desired. 
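
The FIG. 6 control flow, from receiving a data set to the box 614 decision, can be sketched as a loop over data sets with a halt check. The function and parameter names are hypothetical, and the processing stages are passed in as plain callables so that the sketch stays independent of any particular baseline, curve-fit, or typing implementation.

```python
def analyze_run(data_sets, stages, halt_requested=lambda: False):
    """Apply each stage in order to every data set until data runs out or a halt is requested."""
    results = []
    for data in data_sets:          # box 614: is more output data present?
        if halt_requested():        # operator halt ends the sample run
            break
        for stage in stages:        # boxes 606, 608, 610 and 612, in order
            data = stage(data)
        results.append(data)
    return results


# Stand-in stages that only transform numbers, to show the ordering.
stages = [lambda d: d + 1, lambda d: d * 2]
print(analyze_run([1, 2, 3], stages))
```

Because the data sets are consumed one at a time, the same loop serves a short batch or a run that continues around the clock.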

Databases 

In cases of high-throughput, the system stores results of all samples in all 
runs in a database. The sample run history may be selected for viewing through 
a user interface, such as, but not limited to, that illustrated in FIG. 7. 
The user interface permits review of the database created by one or more 
sample runs. An example of the user interface to such a database is shown in 
the screen display of FIG. 8. The database provides a means of obtaining test 
output, reaction details, sample details, and assay details for each sample under 
test. For example, shown as output collected in the database are the sample 
plate number, location of the sample well, sample and plate IDs, name, result of 
genotype matching, and actual spectrum for each sample. 
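
The per-sample record listed above can be sketched as a single table. This is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for the database management system of the preferred embodiment, and the table name, column names, run identifier, and sample values are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sample_result (
        run_id        INTEGER NOT NULL,   -- hypothetical key for the sample run
        plate_number  INTEGER NOT NULL,
        well          TEXT    NOT NULL,   -- location of the sample well
        sample_id     TEXT    NOT NULL,
        plate_id      TEXT    NOT NULL,
        name          TEXT,
        genotype      TEXT,               -- result of genotype matching
        spectrum      BLOB,               -- stored spectrum for the sample
        PRIMARY KEY (run_id, plate_number, well)
    )
""")
conn.execute(
    "INSERT INTO sample_result VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    (1, 12, "A1", "S-0001", "P-0012", "demo", "female", b""),
)
rows = conn.execute(
    "SELECT well, genotype FROM sample_result WHERE run_id = ?", (1,)
).fetchall()
print(rows)
```

Keying each row by run, plate, and well lets the run history be reviewed sample by sample, as the user interface described above does.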

A database analysis system is also integrated into the APL system (see 
FIG. 7), and permits a user to (1) create a new run; (2) copy an existing run; (3) 
edit or view an existing run; (4) change status or add comment; (5) view the 
history of a run; and (6) create or edit an assay or test. In the preferred 
embodiment, the database is supported by a database management system from 
Oracle Corporation. 

The systems may also draw from data available in databases or collected 
in databases. Of interest and exemplary, though not limiting, thereof, is the 
healthy patient database described in copending U.S. provisional application 
Serial No. 60/159,176, filed October 13, 1999. 

The processes, systems, and products provided herein have been 
described above in terms of a presently preferred embodiment. There are, 
however, many configurations for automated high throughput systems that 
include processing stations not specifically described herein but that are 
apparent from the disclosure herein. The disclosure herein is not limited to the 
particular embodiments described herein, but rather, is understood to have wide 
applicability with respect to automated process lines generally, particularly in the 
areas of diagnostics and high throughput screening protocols. 

Since modifications will be apparent to those of skill in this art, it is intended 
that this invention be limited only by the scope of the appended claims. 

CLAIMS: 

1. An automated system for high throughput processing of biological 
samples, the system comprising: 

a plurality of processing stations, each of which performs a procedure on 
a biological sample contained in a reaction vessel, wherein one of the processing 
stations comprises a mass spectrometer; 

a robotic system that transports the reaction vessel from processing 
station to processing station; 

a data analysis system that receives test results of the processing 
stations and automatically processes the test results to make a 
determination regarding the biological sample in the reaction 
vessel; and 

a control system that determines when the test at each processing 
station is complete and, in response, moves the reaction vessel to 
the next test station, and continuously processes reaction vessels 
one after another until the control system receives a stop 
instruction. 

2. A system of claim 1, wherein the reaction vessel comprises a 
multiple-well substrate or a solid support or both. 

3. A system of claim 1 or claim 2, further including a mass 
spectrometer interface that automatically transfers samples into the mass 
spectrometer for processing. 

4. A system of any of claims 1-3, wherein the data analysis system 
processes the test results from the mass spectrometer and makes a data typing 
decision regarding the biological sample. 

5. A system of any of claims 1-4, wherein the data analysis system 
processes the test results by receiving test data from the mass spectrometer 
such that the test data for a biological sample contains one or more peaks, 
whereupon the data analysis system removes a residual baseline from the test 
data for a biological sample, curve fits each peak of the biological sample test 
data to predetermined input parameters, determines a probability that each peak 
of the biological sample test data is a valid peak, and makes a data typing 
decision regarding the biological sample in accordance with the determined valid 
peaks. 

6. A system of any of claims 1-5, wherein the data analysis system 
displays exemplary test spectra for data types determined by the data analysis 
system, along with a graph of test data picked peaks and a graph of smoothed 
test spectra data for a biological sample. 

7. A system of any of claims 1-6, wherein the data analysis system 
receives test run input parameters that determine processing until a different set 
of input parameters is received. 

8. A system of any of claims 1-7, wherein the data analysis system 
removes the residual baseline from the test data by modeling the baseline of the 
mass spectrometer data with a quadratic equation specified by the input 
parameters. 

9. A system of claim 8, wherein the input parameters specify a range 
of data over which the baseline is modeled. 

10. A system of claim 9, wherein the baseline is modeled over a peak 
free region specified by the input parameters. 

11. A system of claim 7 or claim 8, wherein the picked peaks graph 
represents all peaks in the mass spectrometer output that have a height that 
exceeds the residual baseline corrected data. 

12. A system of claim 11, wherein the data analysis system validates 
a peak after comparing a probability density function for the peak free region 
with a probability density function for a fitted peak if the comparison shows that 
the respective probability density functions overlap by a predetermined amount. 

13. A system of any of claims 1-12, wherein the process line includes 
a contamination-controlled environment and a non-sterile environment, each 
environment containing one or more processing stations. 

14. An automated system for high throughput processing of biological 
samples, the system comprising: 

a process line comprising a plurality of processing stations, each of which 
performs a procedure on a biological sample contained in or on a 
reaction vessel; 

a robotic system that transports the reaction vessel from processing 
station to processing station; 

a data analysis system that receives test results of the process line and 
automatically processes the test results to make a determination 
regarding the biological sample in the reaction vessel; 

a control system that determines when the test at each processing 
station is complete and, in response, moves the reaction vessel to 
the next test station, and continuously processes reaction vessels 
one after another until the control system receives a stop 
instruction; and 

a contamination-controlled environment and a non-sterile 
environment, each environment containing one or more processing 
stations. 

15. The system of any of claim 1 3 or claim 1 4, further comprising a 
1 5 taxicab that automatically transports samples between the two environments. 

16. A method for high throughput processing of biological samples,
the method comprising: 

transporting a reaction vessel along a process line of the system of any of
claims 1-15 having a plurality of processing stations, each of
which performs a procedure on one or more biological samples
contained in the reaction vessel;
determining when the test procedure at each processing station is
complete and, in response, moving the reaction vessel to the next
processing station;
receiving test results of the process line and automatically processing the
test results to make a data analysis determination regarding the
biological samples in the reaction vessel; and
processing reaction vessels continuously one after another until receiving 
a stop instruction. 
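
By way of illustration only, the control flow recited in the method above (each vessel is moved from station to station as each procedure completes, the results are analyzed, and processing repeats until a stop instruction arrives) might be sketched in Python as follows. The names `run_process_line` and `STOP`, and the callables standing in for processing stations, are hypothetical and do not appear in the application; a station function returning is assumed to mean that its procedure is complete.

```python
import queue

# Hypothetical sentinel standing in for the "stop instruction".
STOP = object()


def run_process_line(vessels, stations, analyze):
    """Process vessels one after another until STOP is received.

    vessels  -- queue.Queue of vessel identifiers, ended by STOP
    stations -- ordered callables; returning is assumed to mean "complete"
    analyze  -- callable mapping the final test result to a determination
    """
    determinations = {}
    while True:
        vessel = vessels.get()
        if vessel is STOP:
            break
        result = vessel
        for station in stations:
            # Move to the next station only after this one has returned.
            result = station(result)
        determinations[vessel] = analyze(result)
    return determinations


if __name__ == "__main__":
    line = queue.Queue()
    for item in ("plate-1", "plate-2", STOP):
        line.put(item)
    stations = [lambda s: s + ":prepared", lambda s: s + ":measured"]
    print(run_process_line(line, stations, lambda r: r.endswith(":measured")))
```

A blocking queue is used here only so that the loop can wait for the next vessel; any vessels queued after the stop sentinel are left unprocessed.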

17. The method of claim 16, wherein the step of transporting includes
automatically transferring samples into a mass spectrometer for processing using
a robotic mass spectrometer interface.






18. A method of claim 16, wherein the step of receiving test results
comprises:
receiving test data from the mass spectrometer such that the test data for
a biological sample contains one or more peaks;
removing a residual baseline from the test data for a biological sample;
curve fitting each peak of the biological sample test data to
predetermined input parameters;
determining a probability that each peak of the biological sample test data
is a valid peak; and
making a data typing decision regarding the biological sample in
accordance with the determined valid peaks.

19. A method of claim 16, further including the step of displaying
exemplary test spectra for data types to be determined by the data analysis
system, along with a graph of test data picked peaks and a graph of smoothed
test spectra data for a biological sample.

20. A method of claim [illegible], wherein the data analysis system receives
test run input parameters that determine processing until a different set of input
parameters are received.

21. A method of claim [illegible], wherein the step of displaying comprises
displaying exemplary test spectra for data types to be determined by the data
analysis system, along with a graph of test data picked peaks and a graph of
smoothed test spectra data for a biological sample, and the input parameters
specify display parameters.

22. A method of claim 16, wherein the step of removing residual
baseline from the test data comprises modeling the baseline of the mass
spectrometer data with a quadratic equation specified by the input parameters.

23. A method of claim [illegible], wherein the input parameters specify a
range of data over which the baseline will be modeled.

24. A method of claim [illegible], wherein the baseline is modeled over a
peak free region specified by the input parameters.

25. A method of claim 21, wherein the picked peaks graph represents
all peaks in the mass spectrometer output that have a height that exceeds the 
residual baseline corrected data. 

26. A method of claim 25, wherein the data analysis system validates
a peak after comparing a probability density function for the peak free region
with a probability density function for a fitted peak if the comparison shows that
the respective probability density functions overlap by a predetermined amount.

27. A method of claim 16, wherein the process line includes a
contamination-controlled environment and a non-sterile environment, and the 
step of transporting includes automatically transporting samples between the 
two environments in a sterile taxicab. 

28. A data analysis system comprising: 

a computer having an operating environment that executes a data
analysis program for processing test results from a process line of
any of claims 1-15 having a plurality of processing stations, each of
which performs a procedure on a biological sample contained in or
on a reaction vessel; and

a computer interface that receives the test results from the process line
and provides the test results to the data analysis program;
wherein the data analysis program automatically processes the test results to
make a determination regarding the biological sample in the reaction vessel and
continuously performs such processing for biological samples until a stop
instruction is received.

29. A data analysis system of claim 28, wherein the data analysis
system processes the test results by receiving test data from the mass
spectrometer such that the test data for a biological sample contains one or
more peaks, whereupon the data analysis system removes a residual baseline
from the test data for a biological sample, curve fits each peak of the biological
sample test data to predetermined input parameters, determines a probability
that each peak of the biological sample test data is a valid peak, and makes a
data typing decision regarding the biological sample in accordance with the
determined valid peaks.

30. A data analysis system of claim 28 or claim 29, wherein the data
analysis system displays exemplary test spectra for data types to be determined
by the data analysis system, along with a graph of test data picked peaks and a
graph of smoothed test spectra data for a biological sample.

31. A data analysis system of claim 28 or claim 29, wherein
the data analysis system receives test run input parameters that determine
processing until a different set of input parameters are received.

32. A data analysis system of claim 31, wherein the data analysis
system displays exemplary test spectra for data types to be determined by the
data analysis system, along with a graph of test data picked peaks and a graph
of smoothed test spectra data for a biological sample, and the input parameters
specify display parameters.

33. A data analysis system of claim 28, wherein the data analysis
system removes the residual baseline from the test data by modeling the
baseline of the mass spectrometer data with a quadratic equation specified by
the input parameters.

34. A data analysis system of claim 33, wherein the input parameters 
specify a range of data over which the baseline will be modeled. 

35. A data analysis system of claim 34, wherein the baseline is
modeled over a peak free region specified by the input parameters.
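
As a non-authoritative sketch of the baseline removal recited in the preceding claims, the Python below fits a quadratic to the spectrum points that lie inside a fitting range and inside user-specified peak free regions, then subtracts the fitted curve from every point. It assumes that the input parameters supply the fitting range and the peak free regions and that the quadratic coefficients are obtained by ordinary least squares; the claims do not state how the coefficients are determined. The function names `fit_quadratic` and `remove_baseline` are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    if len(xs) < 3:
        raise ValueError("need at least three points to fit a quadratic")
    # Normal equations for the unknowns (a, b, c).
    s = [sum(x ** k for x in xs) for k in range(5)]        # s[k] = sum of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[4], s[3], s[2], t[2]],
         [s[3], s[2], s[1], t[1]],
         [s[2], s[1], s[0], t[0]]]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    # Back substitution.
    p = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * p[j] for j in range(i + 1, 3))
        p[i] = (m[i][3] - tail) / m[i][i]
    return tuple(p)


def remove_baseline(xs, ys, fit_range, peak_free):
    """Subtract a quadratic baseline from a spectrum.

    fit_range -- (lo, hi): range of data over which the baseline is modeled
    peak_free -- list of (lo, hi) regions assumed to contain no peaks
    """
    lo, hi = fit_range
    pts = [(x, y) for x, y in zip(xs, ys)
           if lo <= x <= hi and any(a <= x <= b for a, b in peak_free)]
    a, b, c = fit_quadratic([x for x, _ in pts], [y for _, y in pts])
    return [y - (a * x * x + b * x + c) for x, y in zip(xs, ys)]
```

Restricting the fit to peak free points keeps the peaks themselves from pulling the baseline upward. For real mass axes with large values, centering and scaling the x values before fitting would improve numerical conditioning.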

36. A data analysis system of claim 35, wherein the picked peaks 
graph represents all peaks in the mass spectrometer output that have a height 
that exceeds the residual baseline corrected data. 

37. A data analysis system of claim 36, wherein the data analysis
system validates a peak after comparing a probability density function for the
peak free region with a probability density function for a fitted peak if the
comparison shows that the respective probability density functions overlap by a
predetermined amount.
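
One possible reading of the validation step above can be sketched in Python with the standard library's `statistics.NormalDist`, whose `overlap` method returns the shared area of two normal probability density functions. The sketch rests on several assumptions that are not stated in the claims: both distributions are taken to be normal, the peak free region is summarized by its sample mean and standard deviation, the fitted peak is summarized by a height and an uncertainty, and a peak is accepted when the overlap does not exceed a threshold. The function name `validate_peak` and the default threshold of 0.05 are hypothetical.

```python
from statistics import NormalDist


def validate_peak(noise_samples, peak_height, peak_sigma, max_overlap=0.05):
    """Return (is_valid, overlap) for one fitted peak.

    noise_samples -- baseline-corrected intensities from the peak free region
    peak_height   -- fitted height of the candidate peak
    peak_sigma    -- assumed uncertainty of the fitted height
    max_overlap   -- hypothetical "predetermined amount" of allowed overlap
    """
    noise = NormalDist.from_samples(noise_samples)
    peak = NormalDist(peak_height, peak_sigma)
    # Shared area under the two probability density functions (about 0 to 1).
    overlap = noise.overlap(peak)
    return overlap <= max_overlap, overlap
```

A small overlap means the fitted peak is unlikely to be explained by the noise seen in the peak free region. The peak free samples must not all be identical, because `overlap` is undefined for a zero standard deviation.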

38. A method for high throughput processing of biological samples,
the method comprising:

transporting a reaction vessel along a process line having a processing
station that performs a mass spectrometer test procedure on one
or more biological samples contained in the reaction vessel;
providing the reaction vessel to the mass spectrometer and performing
the mass spectrometer test; and
continuously providing reaction vessels to the mass spectrometer and
receiving test results of the mass spectrometer and automatically
processing the test results to make a determination regarding a
characteristic of the biological samples in the reaction vessel,
wherein the characteristic is the biological sample genotype.

39. A method of claim 38, wherein the reaction vessel comprises a
multiple-well sample tray or solid support.

40. A method of claim 38 or 39, wherein the step of continuously
providing reaction vessels to the mass spectrometer comprises automatically 
transferring samples into the mass spectrometer for processing using a robotic 
mass spectrometer interface. 

41. A method of any of claims 38-40, wherein the step of receiving
test results comprises:

receiving test data from the mass spectrometer such that the test data for
a biological sample contains one or more peaks;

removing a residual baseline from the test data for a biological sample;

curve fitting each peak of the biological sample test data to
predetermined input parameters;

determining a probability that each peak of the biological sample test data
is a valid peak; and

making a data typing decision regarding the biological sample in
accordance with the determined valid peaks.
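
The final step of the claim above, a typing decision made from the valid peaks, might look like the following Python sketch when the data type is a genotype. It assumes that each allele is associated with an expected mass, that a peak counts as valid when its probability meets a threshold, and that an allele is called when a valid peak falls within a mass tolerance of the expected value. The function name `type_sample`, the thresholds and the masses in the usage example are all hypothetical and are not taken from the application.

```python
def type_sample(peaks, allele_masses, min_prob=0.95, tol=2.0):
    """Return a sorted tuple of the alleles supported by valid peaks.

    peaks         -- list of (mass, probability_valid) pairs
    allele_masses -- dict mapping allele label to expected mass
    min_prob      -- hypothetical validity threshold for a peak
    tol           -- hypothetical mass tolerance, same units as the masses
    """
    valid = [mass for mass, prob in peaks if prob >= min_prob]
    called = {allele for allele, expected in allele_masses.items()
              if any(abs(mass - expected) <= tol for mass in valid)}
    return tuple(sorted(called))


if __name__ == "__main__":
    masses = {"A": 5001.0, "C": 5319.2}  # made-up values for illustration
    peaks = [(5000.8, 0.99), (5304.1, 0.40), (5320.0, 0.98)]
    print(type_sample(peaks, masses))
```

Under these assumptions, one called allele would suggest a homozygous sample, two a heterozygous sample, and an empty result a sample that could not be typed.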

42. A method of any of claims 38-41, further including the step of
displaying exemplary test spectra for data types to be determined by the data
analysis system, along with a graph of test data picked peaks and a graph of
smoothed test spectra data for a biological sample.

43. A method of any of claims 38-42, wherein the data analysis
system receives test run input parameters that determine processing until a
different set of input parameters are received.

44. A method of claim 43, wherein the step of displaying comprises
displaying exemplary test spectra for data types to be determined by the data
analysis system, along with a graph of test data picked peaks and a graph of
smoothed test spectra data for a biological sample, and the input parameters
specify display parameters.

45. A method of claim 41, wherein the step of removing the residual
baseline from the test data comprises modeling the baseline of the mass
spectrometer data with a quadratic equation specified by the input parameters.

46. A method of claim 45, wherein the input parameters specify a
range of data over which the baseline will be modeled.

47. A method of claim 46, wherein the baseline is modeled over a 
peak free region specified by the input parameters. 

48. A method of claim 44, wherein the picked peaks graph represents
all peaks in the mass spectrometer output that have a height that exceeds the
residual baseline corrected data.

49. A method of claim 48, wherein the data analysis system validates 
a peak after comparing a probability density function for the peak free region 
with a probability density function for a fitted peak if the comparison shows that 
the respective probability density functions overlap by a predetermined amount. 

50. A method of any of claims 38-49, wherein the process line includes a
contamination-controlled environment and a non-sterile environment. 

51. The method of claim 50, wherein the step of transporting includes
automatically transporting samples between the two environments in a sterile 
taxicab. 

52. The system of any of claims 13-15 that occupies two rooms, 
wherein the components in each room are linked by an automated sample 
transporter. 

53. The system of claim 52, wherein one room is a clean room. 